WO2022083119A1 - Resource configuration method, medium and server - Google Patents

Resource configuration method, medium and server

Info

Publication number
WO2022083119A1
Authority
WO
WIPO (PCT)
Prior art keywords
operator
data processing
task
resource configuration
processing model
Prior art date
Application number
PCT/CN2021/095765
Other languages
English (en)
French (fr)
Inventor
刘子汉
冷静文
陆冠东
陈�全
李超
过敏意
Original Assignee
上海交通大学
Priority date
Filing date
Publication date
Application filed by 上海交通大学
Publication of WO2022083119A1 publication Critical patent/WO2022083119A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Definitions

  • The present invention relates to resource allocation, and in particular to a resource configuration method, a medium and a server.
  • The purpose of the present invention is to provide a resource configuration method, a medium and a server to solve the problem that existing resource configuration methods mainly optimize the performance of a single data processing model and are therefore difficult to apply to complex scenarios with multiple data processing models.
  • a first aspect of the present invention provides a resource configuration method; the resource configuration method is applied to a server of a multi-core architecture and includes: acquiring a task that can be executed by the server as a first task; acquiring the data processing model corresponding to the first task as the first data processing model, wherein each first data processing model includes at least one operator; performing resource configuration for each operator in the first data processing model to obtain the number of resources used by each operator in the first data processing model; when the server receives a user's task request, acquiring a second task, wherein the second task includes the current tasks of the server and the task corresponding to the user's task request; and, when the number of second tasks is greater than 1, executing a collaborative resource configuration sub-method, wherein the collaborative resource configuration sub-method includes: acquiring the data processing model corresponding to the second task as the second data processing model; and obtaining the number of resources used by each operator in the second data processing model according to the number of resources used by each operator in the first data processing model.
  • the method for obtaining the number of resources used by an operator includes: configuring different amounts of resources for the operator, and acquiring the operator performance corresponding to each configuration; and obtaining the number of resources used by the operator according to the operator performance corresponding to each configuration.
  • the resource configuration method further includes: performing operator fusion and/or operator splitting on the operators in the first data processing model according to the number of resources used by each operator in the first data processing model.
  • an implementation method for acquiring the scheduling order of each operator in the second data processing model includes: acquiring a performance model of each operator in the second data processing model, wherein the performance model includes the execution time of the operator; acquiring the service quality requirements of each second task; and acquiring the scheduling order of each operator in the second data processing model according to the service quality requirements of each second task and the performance models of the operators.
  • the cooperative resource configuration sub-method further includes: acquiring an interference model between the operators in the second data processing model; and adjusting the scheduling sequence and parallel execution state of each operator in the second data processing model according to the interference model.
  • the cooperative resource configuration sub-method further includes: acquiring the resource usage status of the server according to the number of resources used by each operator in the second data processing model, the scheduling order and the parallel execution state; and adjusting the number of resources used by at least one operator in the second data processing model according to the resource usage status of the server.
  • an implementation method for acquiring the second task includes: stopping the currently executing resource configuration scheme; acquiring unfinished tasks and subtasks; and taking the task corresponding to the user's task request together with the unfinished tasks and subtasks as the second task.
  • the resource configuration method performs resource configuration in units of cores of the server.
  • a second aspect of the present invention provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the resource configuration method described in the first aspect of the present invention is implemented.
  • a third aspect of the present invention provides a server; the server has a multi-core architecture and includes: a memory, which stores a computer program; a processor, communicatively connected to the memory, which executes the resource configuration method according to the first aspect of the present invention when the computer program is invoked; and a display, communicatively connected to the processor and the memory, for displaying the GUI of the resource configuration method.
  • the resource configuration method can obtain the second data processing model corresponding to each second task, and obtain the scheduling sequence and parallel execution state of the operators in each second data processing model; on this basis, the resources of the server are allocated to all operators in the second data processing model. It can be seen that the resource configuration method of the present invention can be applied to complex scenarios with multiple data processing models.
  • FIG. 1 shows task information and files required for the task acquired in a specific embodiment of the resource configuration method according to the embodiment of the present invention.
  • FIG. 2A shows a flowchart of the resource allocation method according to the present invention in a specific embodiment.
  • FIG. 2B is a flowchart of step S15 in a specific embodiment of the resource allocation method of the present invention.
  • FIG. 2C is a diagram showing an example of a resource allocation scheme obtained in a specific embodiment of the resource allocation method of the present invention.
  • FIG. 3 is a flowchart of step S13 in a specific embodiment of the resource allocation method according to the present invention.
  • FIG. 4 is an exemplary diagram of performing operator fusion and operator segmentation in a specific embodiment of the resource allocation method according to the present invention.
  • FIG. 5 shows a flow chart of obtaining an operator scheduling sequence in a specific embodiment of the resource allocation method of the present invention.
  • FIG. 6A shows a flowchart of adjusting the scheduling sequence and parallel execution state of operators in a specific embodiment of the resource allocation method of the present invention.
  • FIG. 6B shows a flowchart of obtaining an interference model in a specific embodiment of the resource allocation method of the present invention.
  • FIG. 7A shows a flow chart of adjusting the amount of resources used by an operator in a specific embodiment of the resource allocation method of the present invention.
  • FIG. 7B shows an example diagram of adjusting the number of resources used by an operator in a specific embodiment of the resource allocation method of the present invention.
  • FIG. 8 is a flowchart of step S14 in a specific embodiment of the resource allocation method of the present invention.
  • FIG. 9 shows a flowchart of the resource allocation method according to the present invention in a specific embodiment.
  • FIG. 10 is a schematic structural diagram of the server according to the present invention in a specific embodiment.
  • each task request corresponds to a task
  • each task includes multiple subtasks
  • each subtask corresponds to one or more data processing models and task files
  • Each data processing model contains one or more operators.
  • Task 1 corresponds to two subtasks, target detection and target tracking; the target detection subtask corresponds to the YOLO-V3 model and the target tracking subtask corresponds to the GOTURN model. Both the YOLO-V3 model and the GOTURN model contain multiple operators, such as convolution, pooling and fully connected operators.
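The task → subtask → model → operator hierarchy described above can be sketched as a simple data model (a hypothetical illustration in Python; the class and field names are not from the patent):

```python
from dataclasses import dataclass, field

# Hypothetical data model: a task request maps to a task, a task to
# subtasks, a subtask to data processing models, and each model to
# a list of operators, mirroring the Task 1 example above.

@dataclass
class Operator:
    name: str       # e.g. "convolution", "pooling", "fully_connected"
    cores: int = 0  # cores assigned later by resource configuration

@dataclass
class Model:
    name: str
    operators: list = field(default_factory=list)

@dataclass
class Subtask:
    name: str
    models: list = field(default_factory=list)

@dataclass
class Task:
    name: str
    subtasks: list = field(default_factory=list)

# Task 1: target detection (YOLO-V3) and target tracking (GOTURN).
task1 = Task("target detection and tracking", [
    Subtask("target detection",
            [Model("YOLO-V3", [Operator("convolution"), Operator("pooling")])]),
    Subtask("target tracking",
            [Model("GOTURN", [Operator("convolution"), Operator("fully_connected")])]),
])
```

The operator lists here are abbreviated; real models would carry many more operators and their parameters.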
  • the present invention provides a resource configuration method, which is applied to a server of a multi-core architecture. Specifically, when the server needs to perform two or more second tasks according to users' requests, the resource configuration method can obtain the second data processing model corresponding to each second task, and obtain the scheduling sequence and parallel execution state of the operators in each second data processing model; on this basis, the resources of the server are allocated to all operators in the second data processing models. It can be seen that the resource configuration method of the present invention can be applied to complex scenarios with multiple data processing models.
  • the resource configuration method is applied to a server of a multi-core architecture.
  • the resource configuration method includes:
  • S11 Acquire a task that can be performed by the server as a first task; wherein, the first task corresponds to a service that the server can provide for a user. Since the server provides the user with a set of predetermined services, the tasks that can be performed by the server can be directly obtained according to the predetermined services, and then the first task can be obtained. For example, for a certain server, if the services that the server can provide for users include target detection and tracking services and map construction services, the tasks that the server can perform include target detection and tracking tasks and map construction tasks.
  • step S12 Acquire a data processing model corresponding to the first task as a first data processing model, wherein each of the first data processing models includes at least one operator.
  • the scenario contains a large amount of task information
  • the task information is a priori information, that is, information that is available at the server side before the user's request is received.
  • the task information includes, for example, the logical structure of the first task, the structure of the model, and the types and parameters of the operators. Therefore, step S12 can obtain the first data processing model according to the task information. For example, referring to FIG. 1:
  • the task information of task 2 includes: the name of the task is map construction, and its logical structure includes visual odometry, map reconstruction, and loop closure detection,
  • Its data processing model includes DeepVO, CNN-SLAM and SDA-based, and according to the data processing model, the types and parameters of the operators it contains can be directly obtained.
  • the resources of the server can be storage resources, computing resources, etc.
  • the resource allocation algorithm configures the resources of the server in units of cores.
  • the number of resources used by each operator can be represented by the number of cores used by that operator; for example, the number of resources used by operator 1 is 8 cores, the number of resources used by operator 2 is 16 cores, and so on.
  • the above-mentioned steps S11-S13 are usually executed before running (at compile time). After steps S11-S13 have been executed, the server can begin to provide services for users; thereafter, a user can send a task request and the server performs the corresponding task.
  • when the server receives the user's task request, a second task is acquired; wherein the second task includes the current tasks of the server and the task corresponding to the user's task request.
  • the current tasks of the server include tasks that the server is executing and tasks that the server has not yet executed; therefore, the second task includes the task corresponding to the user's task request, the tasks being executed by the server, and the tasks not yet executed by the server.
  • the user's task request may originate from the same user, or may originate from multiple users. Particularly, when the server is currently in an idle state, the second task only includes a task corresponding to the user's task request.
  • the method for configuring the resources of the server includes: acquiring the operators included in the data processing model corresponding to the second task; and, for any operator, allocating the corresponding number of the server's resources to the operator according to the number of resources used by that operator.
  • the cooperative resource configuration sub-method described in this embodiment includes:
  • S152 Acquire the number of resources used by each operator in the second data processing model according to the number of resources used by each operator in the first data processing model. Since the second task is a task that the user requests the server to perform, and the first task is a task that the server can perform, each second task is included in the first tasks; therefore, the number of resources used by each operator in the second data processing model corresponding to the second task can be obtained from the number of resources used by the corresponding operator in the first data processing model corresponding to the first task.
  • S153 Acquire the scheduling sequence and parallel execution state of each operator in the second data processing model. Specifically, limited by the interdependence between operators and the maximum available resources of the server, some operators in the second data processing model must be executed sequentially in time, such as operators OP-1 and OP-4 in FIG. 2C; the scheduling sequence of the operators describes this sequential execution order. In addition, the server can execute two or more operators in parallel (that is, simultaneously) to improve performance, such as operators OP-1, OP-2 and OP-3 in FIG. 2C. For a given operator, its parallel execution state describes whether the operator can be executed in parallel with other operators, and/or the number and names of the operators executed in parallel with it.
  • step S154 Configure the resources of the server according to the number of resources used by each operator in the second data processing model, the scheduling order and the parallel execution state, so as to generate a resource configuration scheme of the server. Specifically, after the scheduling sequence and parallel execution state of each operator have been determined, step S154 can realize the configuration of the server's resources in combination with the number of resources used by each operator.
  • For example, operators OP-1, OP-2 and OP-3 are executed first and can be executed in parallel, and operator OP-4 is executed afterwards. If the total number of cores of the server is 32, then according to the above information the server can allocate 8 cores each to operators OP-1, OP-2 and OP-3 at the same time; after operators OP-1, OP-2 and OP-3 have all finished, the server allocates 32 cores to operator OP-4, completing the configuration of its resources.
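The 32-core example above can be sketched as a stage-based schedule check (operator names and core counts follow the example; the helper itself is an illustrative assumption, not the patent's algorithm):

```python
# Minimal sketch: operators are scheduled in stages; operators in the same
# stage run in parallel, and each stage's combined core allocation must fit
# on the server. TOTAL_CORES matches the 32-core example in the text.

TOTAL_CORES = 32

def check_schedule(stages, total_cores=TOTAL_CORES):
    """stages: list of stages, each a list of (operator_name, cores).
    Returns True if every stage fits within the server's cores."""
    return all(sum(c for _, c in stage) <= total_cores for stage in stages)

schedule = [
    [("OP-1", 8), ("OP-2", 8), ("OP-3", 8)],  # executed in parallel (24 <= 32)
    [("OP-4", 32)],                            # executed after the first stage
]
print(check_schedule(schedule))  # True
```

A real scheduler would also account for operator dependencies and execution times; this only validates core budgets per stage.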
  • the resource configuration method can obtain the second data processing model corresponding to each second task, and obtain the scheduling sequence and parallel execution state of the operators in each second data processing model; on this basis, the resources of the server are allocated to all operators in the second data processing model. It can be seen that the resource configuration method of the present invention can be applied to complex scenarios with multiple data processing models.
  • the above steps S11-S13 implement the resource allocation of operators in a single data processing model.
  • the interaction between different tasks can be ignored, so this process can be regarded as a single-task resource allocation stage.
  • Steps S14-S15 implement the resource configuration of operators in at least two data processing models.
  • the interaction between different tasks needs to be considered, so this process can be regarded as a multi-task collaborative resource configuration stage.
  • scenarios with multiple data processing models bring greater challenges in data reuse, operator-granularity preemption of shared resources, and service execution order.
  • the resource configuration method uses the model as the granularity to configure resources, so that more prior information can be obtained, thereby ensuring that the resource configuration method can obtain a better service guarantee rate and lower energy consumption.
  • some embodiments mainly perform resource configuration for heterogeneous clusters such as CPU+GPU/CPU+TPU.
  • This configuration method mainly focuses on GPU/TPU.
  • the resource configuration method described in this embodiment can configure the cores of the server based on the scheduling sequence and parallel execution state between operators, so that the factors of core allocation and pipelined operator execution can be fully considered; therefore, the resource configuration method described in this embodiment can be applied to a server of a multi-core architecture.
  • an implementation method for obtaining the number of resources used by any operator in the first data processing model by using a performance test method includes:
  • S131 configure different amounts of resources for the operator respectively, and acquire the operator performance corresponding to each configuration.
  • the operator performance needs to take into account the execution time of the operator and the number of resources used by the operator, so as to minimize the average resource occupation under the condition of satisfying the quality of service.
  • the operator performance can be described by the product of the operator's execution time and the number of resources used by the operator.
  • the operator performance corresponding to each configuration can be obtained by actually executing the operator in this configuration.
  • S132 according to the operator performance corresponding to each configuration, obtain the number of resources used by the operator.
  • a configuration with the best operator performance is selected from all configurations of the operator, and the number of resources corresponding to the configuration is used as the number of resources used by the operator. For example, if the operator has the best performance when the operator is configured with 8 cores, the number of resources used by the operator is 8 cores.
  • the performance test method is used to obtain the number of resources used by any operator in the first data processing model. This method can ensure that each operator adopts the most economical resource configuration, enabling the operator to achieve acceptable performance while reducing its resource usage as much as possible, so as to provide as many available resources as possible for other operators.
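Steps S131-S132 can be sketched as follows, using the time-resource product that the text describes as the measure of operator performance; the "measured" times below are hypothetical stand-ins for real performance tests:

```python
# Sketch of steps S131-S132: configure the operator with different core
# counts, obtain the performance of each configuration, and select the
# configuration with the best operator performance, described here (as in
# the text) by the product of execution time and cores used.

# Hypothetical measured execution times (ms) per core count.
measured_ms = {1: 130.0, 2: 62.0, 4: 28.0, 8: 16.0, 16: 12.0, 32: 11.0}

def pick_core_count(measured):
    """Return the core count minimizing execution_time * cores."""
    return min(measured, key=lambda c: measured[c] * c)

best = pick_core_count(measured_ms)
print(best)  # 4, since 28.0 ms * 4 cores = 112 core-ms is the smallest product
```

With these made-up numbers, 8 cores is fastest in wall time, but 4 cores minimizes average resource occupation, matching the stated goal of minimizing occupation while meeting quality of service.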
  • In some embodiments, the resource allocation method further includes: performing graph-level optimization on each operator in the first data processing model according to the number of resources used by each operator in the first data processing model, wherein the graph-level optimization includes operator fusion and/or operator splitting.
  • operator fusion refers to fusing, under the restriction of the maximum available resources of the server, several consecutive operators in the first data processing model that use fewer resources and belong to the same task; through operator fusion, at least two operators can be fused into one operator.
  • the operator fusion can increase parallelism and reduce memory access overhead through pipelined execution.
  • For example, operators Convolution 1, Pooling 1, Normalization 1 and Activation 1 are four consecutive operators that each use 8 cores and belong to the same task; graph-level optimization can fuse them into a single operator that uses 16 cores. It can be seen that operator fusion increases parallelism and reduces memory access overhead.
  • the operator splitting refers to splitting an operator with a longer running time into two or more operators under the condition that the maximum available resources of the server are limited and the performance is not affected. Due to the smaller scheduling granularity, the split operator can provide higher flexibility in the multi-cooperative resource configuration stage.
  • For example, operator Convolution 2 uses 16 cores and has a long running time; during operator splitting, Convolution 2 can be split, according to the core usage of the server, into an operator using 16 cores (Convolution 2a) and an operator using 32 cores (Convolution 2b).
  • By splitting operators, this embodiment can provide higher flexibility, higher parallelism and lower memory access overhead; moreover, by combining operator fusion and operator splitting, the hardware idling and resource waste caused by mismatched running times and resource usage can be effectively reduced during resource configuration, and otherwise-wasted hardware resources can be filled to improve performance.
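The fusion and splitting described above can be sketched roughly as follows; the fused core count and the naming scheme (2a, 2b) follow the examples, while the greedy grouping rule and thresholds are illustrative assumptions:

```python
# Hedged sketch of graph-level optimization. Fusion merges runs of two or
# more consecutive operators that each use <= `small` cores and belong to
# the same task; splitting breaks one long-running operator into parts.

def fuse(ops, small=8, fused_cores=16):
    """ops: list of (name, cores, task_id)."""
    out, run = [], []

    def flush():
        if len(run) > 1:  # fuse the accumulated run into one operator
            out.append(("+".join(n for n, _, _ in run), fused_cores, run[0][2]))
        else:
            out.extend(run)
        run.clear()

    for name, cores, task in ops:
        if cores <= small:
            if run and run[0][2] != task:  # different task: cannot fuse across
                flush()
            run.append((name, cores, task))
        else:
            flush()
            out.append((name, cores, task))
    flush()
    return out

def split(op, parts=2):
    """Split one long-running operator into `parts` operators (a, b, ...)."""
    name, cores, task = op
    return [(name + chr(ord("a") + i), cores, task) for i in range(parts)]

fused = fuse([("Convolution 1", 8, 1), ("Pooling 1", 8, 1),
              ("Normalization 1", 8, 1), ("Activation 1", 8, 1),
              ("Convolution 2", 16, 1)])
halves = split(("Convolution 2", 16, 1))
```

In the text's example the split halves use different core counts (16 and 32) depending on the server's core usage; this sketch keeps them equal for simplicity.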
  • an implementation method for obtaining the scheduling order of each operator in the second data processing model includes:
  • S51 Acquire a performance model of each operator in the second data processing model; wherein, the performance model includes the execution time of the operator, which may be acquired in a performance test-based manner. Specifically, during compilation, each operator in the second data processing model is run in different configurations, and a performance model of the operator is constructed according to parameters such as the execution time of the operator. In addition, in actual operation, the performance model may also be updated in real time according to the actual operation of the operator, so as to improve the accuracy of the performance model, thereby obtaining a more optimized resource configuration.
  • S52 Acquire the service quality requirements of each of the second tasks.
  • the service quality requirements of each of the second tasks may be obtained from the user's task request.
  • S53 Acquire the scheduling order of each operator in the second data processing model according to the service quality requirements of each of the second tasks and the performance model of each operator in the second data processing model. For example, the execution of an operator in a task requiring a lower quality of service may be appropriately delayed, so as to preferentially provide the server's resources for use by an operator in a task requiring a higher quality of service.
  • the scheduling sequence of each operator in the second data processing model may also be determined by comprehensively considering the execution times of the operators and the service quality requirements of the second tasks.
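The QoS-aware ordering of steps S51-S53 might look like the following sketch; the earliest-deadline-first rule and all numbers are illustrative assumptions, since the text does not fix a specific priority function:

```python
# Hypothetical sketch: each operator carries its modeled execution time and
# the QoS deadline of its task; operators of tighter-deadline tasks are
# scheduled first, and among equals the longer operator goes first.

def schedule_order(ops):
    """ops: list of (name, exec_time_ms, task_deadline_ms)."""
    return [name for name, _t, _d in sorted(ops, key=lambda o: (o[2], -o[1]))]

ops = [("det_conv", 5.0, 30.0),   # detection task: looser deadline
       ("track_fc", 2.0, 15.0),   # tracking task: tighter deadline
       ("track_conv", 4.0, 15.0)]
print(schedule_order(ops))  # ['track_conv', 'track_fc', 'det_conv']
```

This matches the text's intent of delaying operators from tasks with lower service-quality requirements so that tighter tasks get the server's resources first.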
  • after acquiring the parallel execution state of each operator in the second data processing model, the cooperative resource configuration sub-method further includes:
  • the method for constructing the interference model between the operators includes quantifying shared resource requirements, performance testing, and constructing an analysis model.
  • the quantifying the demand for shared resources refers to quantifying the shared resources demanded among multiple operators, and the shared resources are, for example, cache, bandwidth, and the like.
  • the operator parameters may be randomly generated and/or taken from common networks in order to test the performance of the operators (that is, the operators are executed with multiple sets of operator parameters to obtain the performance corresponding to the different parameters), so as to obtain the interference between operators.
  • a linear regression model, a neural network model, etc. may be used to model the interference between the operators, so as to obtain the interference between the operators in the second data processing model Model.
  • step S62 adjusts the scheduling sequence and parallel execution state of each operator in the second data processing model. Specifically, for two operators executed in parallel, if the interference between them is relatively large, step S62 adjusts the two operators to be executed serially, and adjusts their scheduling order according to their performance models and/or quality of service requirements. For example, if operator 1 and operator 2 are executed in parallel and the interference model indicates that the interference between them is relatively large, operator 1 and operator 2 need to be changed to serial execution, and the scheduling order between operator 1 and operator 2 needs to be re-determined.
  • This embodiment can adjust the scheduling order and parallel execution state of each operator in the second data processing model according to the interference model, so that the resource allocation method can reduce or even eliminate the interference introduced by parallel execution between operators of different tasks, which is beneficial to improving the accuracy of resource allocation.
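Step S62's adjustment can be sketched as follows; the pairwise interference scores and the threshold are hypothetical, standing in for the regression or neural-network interference model mentioned above:

```python
# Minimal sketch of step S62: given a pairwise interference model, keep
# low-interference pairs in parallel and serialize pairs whose predicted
# interference exceeds a threshold. Scores and threshold are made up.

def adjust_parallel(pairs, interference, threshold=0.3):
    """pairs: list of (op_a, op_b) currently executed in parallel.
    Returns (still_parallel, serialized)."""
    parallel, serial = [], []
    for a, b in pairs:
        score = interference.get((a, b), interference.get((b, a), 0.0))
        (serial if score > threshold else parallel).append((a, b))
    return parallel, serial

interf = {("op1", "op2"): 0.6, ("op1", "op3"): 0.1}  # hypothetical model output
par, ser = adjust_parallel([("op1", "op2"), ("op1", "op3")], interf)
```

Serialized pairs would then be re-ordered using the operators' performance models and quality-of-service requirements, as described above.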
  • the cooperative resource configuration sub-method of this embodiment further includes:
  • S71 Acquire the resource usage status of the server according to the number of resources used by each operator in the second data processing model, the scheduling order, and the parallel execution status.
  • the resource usage status of the server includes whether the server has idle resources and the number of idle resources. For example, in FIG. 7B, before the adjustment, operators OP-1, OP-2 and OP-3 are executed in parallel, and all three are executed serially with operator OP-4; during the execution of OP-1, OP-2 and OP-3, there are 8 idle cores on the server.
  • step S72 Adjust the number of resources used by one or more operators in the second data processing model according to the resource usage status of the server. Specifically, if the server has idle resources at a certain moment, step S72 allocates the idle resources to one or more operators being executed at that moment. As shown in FIG. 7B, during the execution of operators OP-1, OP-2 and OP-3, there are 8 idle cores on the server; at this time, step S72 can allocate the 8 idle cores to operator OP-1.
  • the number of resources used by one or more operators in the second data processing model can be adjusted according to the resource usage status of the server, so as to reduce or even eliminate, as much as possible, the waste of hardware resources caused by resource occupation and the scheduling order.
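Steps S71-S72 can be sketched as follows, matching the FIG. 7B example in which 8 idle cores are handed to operator OP-1; the "give everything to the first operator" rule is an illustrative simplification:

```python
# Sketch of steps S71-S72: compute the idle cores during a stage of
# parallel operators and assign them to one running operator. The server
# size matches the 32-core example in the text.

TOTAL_CORES = 32

def absorb_idle(stage, total_cores=TOTAL_CORES):
    """stage: list of [name, cores] running in parallel; mutates and
    returns it with any idle cores assigned to the first operator."""
    idle = total_cores - sum(cores for _, cores in stage)
    if idle > 0 and stage:
        stage[0][1] += idle
    return stage

stage = [["OP-1", 8], ["OP-2", 8], ["OP-3", 8]]  # 24 of 32 cores busy
print(absorb_idle(stage))  # [['OP-1', 16], ['OP-2', 8], ['OP-3', 8]]
```

A fuller implementation could distribute idle cores across all running operators in proportion to expected speedup rather than favoring one.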
  • In practical applications, the server needs to respond in real time at runtime to task requests sent by users. When the server receives a user's task request, it acquires the current tasks of the server and the task corresponding to the task request as the second task. An implementation method for acquiring the second task includes:
  • the currently executing resource configuration scheme is a resource configuration scheme generated according to a task request previously received by the server.
  • from this moment, the resource configuration method discards the original resource configuration scheme.
  • S142 Acquire unfinished tasks and subtasks from the current tasks of the server.
  • the current tasks of the server include tasks that are being executed by the server and tasks that have not yet been executed; the unfinished subtasks refer to subtasks, within the tasks being executed by the server, that have not finished execution or have not started execution.
  • the resource configuration method is mainly oriented to the scenario of a server with a multi-core architecture
  • the task structure (including subtasks and logical dependencies in the task) can be determined once the service type is determined. Therefore, this embodiment can respond to the user's task request in real time during the running process of the server.
  • the resource configuration method further includes: when code is generated and compiled into an executable session, configuring a programming model specific to the deep-learning dedicated chip, in order to support JIT compilation and to dynamically adjust the resource configuration, scheduling order and parallel execution state according to the actual operation of the operators.
  • the resource configuration method performs resource configuration in units of cores of the server.
  • the resource configured to each operator by the resource configuration method is an integer number of cores.
  • the resource configuration method is applied to a server of a multi-core architecture, and configures the resources of the server in units of cores.
  • the resource configuration method described in this embodiment is oriented to cloud service scenarios with multiple users and multiple data processing models; in these scenarios, the request granularity is larger and contains a great deal of prior information. Therefore, this embodiment can make full use of the prior information (for example, the logical structure of the task, the structure of the models in the task, and the types and parameters of the operators) to optimize resource configuration and to model the interference between operators, so that when multiple service requests arrive, a better resource configuration scheme is generated than by independently optimizing the individual data processing models.
  • the resource configuration method described in this embodiment includes:
  • S92: Acquire the task information of the first tasks and the files required by the tasks; the task information and files are provided at compile time as prior information.
  • S93: Acquire the first data processing models corresponding to the first tasks and translate them into a unified intermediate representation.
  • the server supports multiple kinds of data processing models, so all of them must be converted into a unified intermediate model.
  • S94 Perform single-task resource configuration for each of the first tasks.
  • the single-task resource configuration process may be executed before running (compile time).
  • the goal of the single-task resource configuration is to minimize the average resource occupation under the condition of satisfying the service quality.
  • the single-task resource configuration performs graph-level optimization of the model (including operator fusion and operator splitting) with respect to the number of cores used by each operator on the multi-core architecture, and generates an optimized resource configuration for use in the multi-model cooperative resource configuration stage.
  • the construction of the performance model of the operator in the first data processing model is also completed in the single-task resource configuration phase, for use in the multi-model collaborative resource configuration phase.
  • step S97: Perform multi-task cooperative resource configuration on the second tasks.
  • step S94 can obtain the optimal resource configuration of each data processing model under the condition that the service quality requirements are met.
  • the single-task resource configurations obtained in step S94 are merged to generate an overall resource configuration, so as to ensure that as many tasks as possible meet the quality-of-service requirements.
  • step S99: during or after execution, dynamically adjust the operator performance models obtained in step S94 according to feedback such as the execution time and whether the quality-of-service requirements are met, to achieve a more optimized resource configuration.
  • the resource allocation method described in this embodiment handles newly arriving requests as follows: when a new task request arrives, the resource allocation scheme of all existing tasks is discarded, and the remaining parts of the existing tasks are co-optimized together with the new task request, producing a new resource allocation plan.
  • step S94 can provide greater flexibility, higher parallelism, and lower memory access overhead in the single-task configuration stage. Specifically, after determining the resource configuration of each operator, step S94 searches for consecutive operators that use few cores and fuses them to increase parallelism through pipelined execution, and, without affecting performance, splits long-running operators into operators occupying fewer cores wherever possible. Because split operators have a finer scheduling granularity, they provide greater flexibility for the multi-task cooperative resource configuration stage. Graph-level optimization can thus effectively fill the wasted portions of hardware resources, improving the performance of resource allocation.
  • this embodiment adopts the quantification/performance-testing approach of steps S61-S62 to build an analytical model, and determines the parallel execution mode of the several models by analyzing the interference between operators of different types and parameters, so as to minimize the impact of interference.
  • the multi-task cooperative resource configuration can also fully consider the quality-of-service requirements provided by users and reasonably adjust the execution order of operators in different models, so as to maximize the proportion of tasks whose quality-of-service requirements are met.
  • the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the resource configuration method of the present invention is implemented.
  • the present invention further provides a server, and the server is a multi-core architecture.
  • the server 100 includes: a memory 110, which stores a computer program; a processor 120, which is communicatively connected to the memory 110 and executes the resource configuration method of the present invention when invoking the computer program; and a display 130, which is communicatively connected to the memory 110 and the processor 120 and is used to display the GUI of the resource configuration method.
  • the protection scope of the resource allocation method of the present invention is not limited to the execution order of the steps listed in this embodiment; all solutions implemented by adding, removing, or replacing steps of the prior art according to the principles of the present invention fall within the protection scope of the present invention.
  • the present invention provides a resource configuration method, medium and electronic device.
  • the resource allocation method is an optimization method for the compilation and resource configuration of multi-core-architecture deep-learning-specific chips in multi-user, multi-data-processing-model scenarios; it can combine the information of multiple data processing models in real time in various scenarios to co-optimize
  • the resource configuration across models, so as to improve overall service quality, meet service rates, and minimize system energy consumption.
  • after multiple users provide task information, service requirements, and the files required by the tasks, the resource configuration method can, in combination with hardware characteristics, successively optimize the single-task and multi-task resource configurations, and finally generate an executable service session to provide services to the users.
  • existing research on service scheduling mainly aims to maximize the satisfaction rate of latency-sensitive services and to maximize hardware utilization, designing real-time, low-overhead scheduling algorithms for resource sharing between services and for task execution order.
  • unlike existing methods, the present invention considers that when providing services for multiple users and multiple tasks, the users' requirements on the tasks should be satisfied as much as possible; this is the most intuitive goal.
  • to facilitate optimization, the present invention converts this goal into a dual goal: minimizing the resources occupied by each task and by all tasks while satisfying the quality-of-service requirements provided by users; these two goals cover the scenarios the present invention targets.
  • the present invention calculates average resource occupancy for each task according to resource usage and execution time of operators in each task.
  • the optimization objective used in the present invention is to minimize the sum of the average resource occupation of all tasks under the condition that the service quality is satisfied and the operator resource occupation in each task does not exceed the maximum available resources.
  • the resource configuration method can obtain the second data processing model corresponding to each second task, obtain
  • the scheduling order and parallel execution state of the operators in each second data processing model, and on that basis allocate the server's resources to all operators in the second data processing models. The resource configuration method of the present invention is therefore applicable to complex scenarios with multiple data processing models.
  • the present invention effectively overcomes various shortcomings in the prior art and has high industrial utilization value.
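The feedback-driven adjustment of the operator performance models described in step S99 above could be sketched as follows. This is an illustrative stand-in, not the patent's actual implementation: the class name, the exponential-moving-average update rule, and the weight `alpha` are all assumptions.

```python
class OperatorPerfModel:
    """Illustrative sketch: keeps a per-operator execution-time estimate
    and refines it from runtime feedback (an assumed update rule)."""

    def __init__(self, initial_times, alpha=0.3):
        # initial_times: {operator_name: compile-time measured execution time}
        self.times = dict(initial_times)
        self.alpha = alpha  # weight given to new runtime observations

    def update(self, op_name, observed_time):
        # Blend the new observation into the stored estimate (EMA).
        old = self.times[op_name]
        self.times[op_name] = (1 - self.alpha) * old + self.alpha * observed_time
        return self.times[op_name]

model = OperatorPerfModel({"conv1": 10.0})
model.update("conv1", 14.0)  # runtime turned out slower than the compile-time estimate
```

A scheduler built on such a model would re-derive the resource configuration from the updated estimates, which matches the dynamic-adjustment behaviour the bullet describes.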


Abstract

A resource configuration method, medium, and server. The resource configuration method includes: acquiring the tasks that the server is able to execute as first tasks (S11); acquiring the data processing models corresponding to the first tasks as first data processing models (S12), where each first data processing model contains at least one operator; configuring resources for each operator in the first data processing models to obtain the number of resources used by each operator (S13); when the server receives a user's task request, acquiring second tasks (S14), where the second tasks include the server's current tasks and the tasks corresponding to the user's task request; and when the number of second tasks is greater than 1, executing a cooperative resource configuration sub-method (S15). The resource configuration method is applicable to complex scenarios with multiple data processing models.

Description

Resource configuration method, medium, and server — Technical Field
The present invention relates to resource allocation methods, and in particular to a resource configuration method, medium, and server.
Background Art
With the rapid development of deep learning, users' demands on high-performance cloud services have risen accordingly. As deep learning tasks grow more diverse, data processing models more complex, and users more numerous, deep learning services face greater challenges. To meet the resource demands of complex multi-task, multi-model, multi-user scenarios, many chip vendors provide deep-learning-specific chips with high computing power, together with corresponding programming frameworks, for deep learning service providers, and service providers may also combine multiple chips. However, the inventors found in practice that current resource configuration methods mainly optimize performance for a single data processing model and are difficult to apply to complex scenarios with multiple data processing models.
Summary of the Invention
In view of the above shortcomings of the prior art, the purpose of the present invention is to provide a resource configuration method, medium, and server to solve the problem that existing resource configuration methods mainly optimize performance for a single data processing model and are difficult to apply to complex scenarios with multiple data processing models.
To achieve the above and other related purposes, a first aspect of the present invention provides a resource configuration method applied to a server with a multi-core architecture, including: acquiring the tasks that the server can execute as first tasks; acquiring the data processing models corresponding to the first tasks as first data processing models, where each first data processing model contains at least one operator; configuring resources for each operator in the first data processing models to obtain the number of resources used by each operator; when the server receives a user's task request, acquiring second tasks, where the second tasks include the server's current tasks and the tasks corresponding to the user's task request; and when the number of second tasks is greater than 1, executing a cooperative resource configuration sub-method, which includes: acquiring the data processing models corresponding to the second tasks as second data processing models; obtaining the number of resources used by each operator in the second data processing models according to the number of resources used by each operator in the first data processing models; obtaining the scheduling order and parallel execution state of each operator in the second data processing models; and configuring the server's resources according to the number of resources used by, the scheduling order of, and the parallel execution state of each operator in the second data processing models.
In an embodiment of the first aspect, for any operator in the first data processing model, the method of obtaining the number of resources used by that operator includes: configuring different numbers of resources for the operator and obtaining the operator performance corresponding to each configuration; and obtaining the number of resources used by the operator according to the performance of each configuration.
In an embodiment of the first aspect, the resource configuration method further includes: performing operator fusion and/or operator splitting on the operators in the first data processing model according to the number of resources used by each operator in the first data processing model.
In an embodiment of the first aspect, obtaining the scheduling order of each operator in the second data processing model includes: obtaining a performance model of each operator in the second data processing model, where the performance model contains the operator's execution time; obtaining the quality-of-service requirement of each second task; and obtaining the scheduling order of each operator according to the quality-of-service requirements of the second tasks and the operators' performance models.
In an embodiment of the first aspect, after obtaining the parallel execution state of each operator in the second data processing model, the cooperative resource configuration sub-method further includes: obtaining an interference model between operators in the second data processing model; and adjusting the scheduling order and parallel execution state of each operator according to the interference model.
In an embodiment of the first aspect, after obtaining the scheduling order and parallel execution state of each operator in the second data processing model, the cooperative resource configuration sub-method further includes: obtaining the server's resource usage according to the number of resources used by, the scheduling order of, and the parallel execution state of each operator; and adjusting the number of resources used by at least one operator according to the server's resource usage.
In an embodiment of the first aspect, when the server receives a user's task request, acquiring the second tasks includes: stopping the currently executing resource configuration plan; obtaining the unfinished tasks and subtasks from the server's current tasks; and taking the tasks corresponding to the user's task request together with the unfinished tasks and subtasks as the second tasks.
In an embodiment of the first aspect, the resource configuration method configures resources in units of the server's cores.
A second aspect of the present invention provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the resource configuration method of the first aspect.
A third aspect of the present invention provides a server with a multi-core architecture, including: a memory storing a computer program; a processor communicatively connected to the memory, which executes the resource configuration method of the first aspect when invoking the computer program; and a display communicatively connected to the processor and the memory, used to display the GUI of the resource configuration method.
As described above, one technical solution of the resource configuration method, medium, and server of the present invention has the following beneficial effect:
When the server needs to execute two or more second tasks according to users' requests, the resource configuration method can obtain the second data processing model corresponding to each second task, obtain the scheduling order and parallel execution state of the operators in each second data processing model, and on that basis allocate the server's resources to all operators in the second data processing models. The resource configuration method of the present invention is therefore applicable to complex scenarios with multiple data processing models.
Brief Description of the Drawings
Fig. 1 shows the task information and task files acquired by the resource configuration method of an embodiment of the present invention in a specific embodiment.
Fig. 2A shows a flowchart of the resource configuration method of the present invention in a specific embodiment.
Fig. 2B shows a flowchart of step S15 of the resource configuration method in a specific embodiment.
Fig. 2C shows an example of a resource configuration plan obtained by the resource configuration method in a specific embodiment.
Fig. 3 shows a flowchart of step S13 of the resource configuration method in a specific embodiment.
Fig. 4 shows an example of operator fusion and operator splitting performed by the resource configuration method in a specific embodiment.
Fig. 5 shows a flowchart of obtaining the operator scheduling order in a specific embodiment.
Fig. 6A shows a flowchart of adjusting the scheduling order and parallel execution state of operators in a specific embodiment.
Fig. 6B shows a flowchart of obtaining the interference model in a specific embodiment.
Fig. 7A shows a flowchart of adjusting the number of resources used by operators in a specific embodiment.
Fig. 7B shows an example of adjusting the number of resources used by operators in a specific embodiment.
Fig. 8 shows a flowchart of step S14 of the resource configuration method in a specific embodiment.
Fig. 9 shows a flowchart of the resource configuration method in a specific embodiment.
Fig. 10 shows a schematic structural diagram of the server of the present invention in a specific embodiment.
Description of Reference Numerals
100       Server
110       Memory
120       Processor
130         Display
S11~S15    Steps
S151~S154  Steps
S131~S132  Steps
S51~S53    Steps
S61~S62    Steps
S71~S72    Steps
S141~S143  Steps
S91~S99    Steps
Detailed Description of the Embodiments
The embodiments of the present invention are described below through specific examples; those skilled in the art can easily understand other advantages and effects of the present invention from the contents disclosed in this specification. The present invention can also be implemented or applied through other different specific embodiments, and the details in this specification can be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present invention. It should be noted that, where there is no conflict, the following embodiments and their features can be combined with one another.
It should be noted that the drawings provided in the following embodiments only illustrate the basic concept of the present invention schematically; they show only the components related to the present invention rather than being drawn according to the number, shape, and size of the components in an actual implementation. In actual implementation the type, quantity, and proportion of each component may change arbitrarily, and the component layout may be more complex. In addition, relational terms such as "first" and "second" herein are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations.
In the process of a server providing services to users, a user sends task requests to the server, and the server responds to the requests and executes the corresponding tasks. Specifically, a user sends several task requests to the server, where each task request corresponds to one task, each task contains multiple subtasks, each subtask corresponds to one or more data processing models and task files, and each data processing model contains one or more operators. For example, referring to Fig. 1, Task 1 corresponds to two subtasks, object detection and object tracking; the object detection subtask corresponds to the YOLO-V3 model, and object tracking corresponds to the GOTURN model, both of which contain multiple operators such as convolution operators, pooling operators, and fully connected operators.
In practice, an increase in the number of requesting users and/or requested tasks ultimately increases the number of data processing models. Complex multi-user, multi-task, and/or multi-model scenarios can therefore all be regarded, in the end, as complex multi-model scenarios, which pose great challenges to the server's resource configuration. However, the inventors found in practice that existing resource configuration methods mainly optimize performance for a single data processing model and are difficult to apply to such complex scenarios.
To address this problem, the present invention provides a resource configuration method applied to a server with a multi-core architecture. Specifically, when the server needs to execute two or more second tasks according to users' requests, the method obtains the second data processing model corresponding to each second task, obtains the scheduling order and parallel execution state of the operators in each second data processing model, and on that basis allocates the server's resources to all operators in the second data processing models. The method is therefore applicable to complex multi-model scenarios.
In an embodiment of the present invention, the resource configuration method is applied to a server with a multi-core architecture. Referring to Fig. 2A, the resource configuration method includes:
S11: Acquire the tasks that the server can execute as first tasks, which correspond to the services the server can provide to users. Since the server provides a predetermined set of services, the tasks it can execute, and hence the first tasks, can be obtained directly from those services. For example, if a server can provide users with an object detection and tracking service and a map construction service, then the tasks it can execute include an object detection and tracking task and a map construction task.
S12: Acquire the data processing models corresponding to the first tasks as first data processing models, each containing at least one operator. Since the resource configuration method of the present invention is mainly oriented toward cloud service scenarios with multiple users and multiple data processing models, such scenarios contain abundant task information, which is prior information, i.e., information available before the server receives user requests. The task information includes, for example, the logical structure of the first task, the model structures, and the operator types and parameters; step S12 can therefore obtain the first data processing models from it. For example, referring to Fig. 1, if the first tasks acquired in step S11 include Task 2, its task information includes: the task name (map construction), its logical structure (visual odometry, map reconstruction, loop-closure detection), and its data processing models (DeepVO, CNN-SLAM, and SDA-based), and the types and parameters of the contained operators can be obtained directly from those data processing models.
S13: Configure resources for each operator in the first data processing models to obtain the number of resources used by each operator. The server's resources may be storage resources, computing resources, and so on; in particular, the resource configuration algorithm configures the server's resources in units of cores, in which case the number of resources used by each operator can be expressed as a number of cores, e.g., operator 1 uses 8 cores and operator 2 uses 16 cores.
Steps S11-S13 are usually executed before runtime (at compile time). After they finish, the server can begin to provide services to users, who can then request the server to execute the corresponding tasks by sending task requests.
S14: When the server receives a user's task request, acquire second tasks, which include the server's current tasks and the tasks corresponding to the user's task request. The server's current tasks include the tasks being executed and the tasks not yet executed; the second tasks therefore include the tasks corresponding to the user's request, the tasks the server is currently executing, and the tasks the server has not yet executed. The task requests may come from one user or from multiple users. In particular, when the server is currently idle, the second tasks include only the tasks corresponding to the user's request.
S15: When the number of second tasks is greater than 1, execute a cooperative resource configuration sub-method to obtain the server's resource configuration plan. When the number of second tasks is 1, mutual influence between different tasks need not be considered during resource configuration; in that case the server's resources are configured by obtaining the operators contained in the data processing model of the second task and, for each operator, allocating the corresponding number of the server's resources according to the number of resources that operator uses.
Specifically, referring to Fig. 2B, the cooperative resource configuration sub-method in this embodiment includes:
S151: Acquire the data processing models corresponding to the second tasks as second data processing models.
S152: Obtain the number of resources used by each operator in the second data processing models according to the number of resources used by each operator in the first data processing models. Since the second tasks are tasks the user requests the server to execute, while the first tasks are the tasks the server can execute, every second task is contained in the first tasks; the resource counts of the operators in the second data processing models can therefore be obtained from those of the operators in the corresponding first data processing models.
S153: Obtain the scheduling order and parallel execution state of each operator in the second data processing models. Constrained by the dependencies between operators and by the server's maximum available resources, some operators in the second data processing models must execute one after another in time, such as operators OP-1 and OP-4 in Fig. 2C; the scheduling order describes this execution order. In addition, when the server's resources are sufficient, it can also execute two or more operators in parallel (i.e., simultaneously) to improve performance, such as operators OP-1, OP-2, and OP-3 in Fig. 2C. For a given operator, its parallel execution state describes whether it can run in parallel with other operators and/or the number and names of the operators running in parallel with it.
S154: Configure the server's resources according to the number of resources used by, the scheduling order of, and the parallel execution state of each operator in the second data processing models, thereby generating the server's resource configuration plan. Specifically, once the scheduling order and parallel execution state of each operator are determined, the server's resources can be configured in combination with the number of resources each operator uses. For example, suppose operators OP-1, OP-2, OP-3, and OP-4 use 8, 8, 8, and 32 cores respectively; OP-1, OP-2, and OP-3 execute first and can run in parallel, followed by OP-4; and the server has 32 cores in total. The server can then allocate 8 cores each to OP-1, OP-2, and OP-3 simultaneously, and after all three finish, allocate 32 cores to OP-4, completing the configuration of its resources.
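The allocation rule of step S154 can be sketched with a minimal example; the operator names and core counts are taken from the OP-1...OP-4 example above, while the function name and data layout are our own assumptions.

```python
def schedule(groups, cores_needed, total_cores):
    """Assign cores to successive parallel groups of operators.
    groups: list of lists; operators in the same inner list may run in parallel,
    and each inner list runs after the previous one finishes.
    Returns one {operator: cores} allocation per group, and raises if a
    group's combined demand exceeds the server's core budget."""
    plan = []
    for group in groups:
        demand = sum(cores_needed[op] for op in group)
        if demand > total_cores:
            raise ValueError(f"group {group} needs {demand} > {total_cores} cores")
        plan.append({op: cores_needed[op] for op in group})
    return plan

cores = {"OP-1": 8, "OP-2": 8, "OP-3": 8, "OP-4": 32}
plan = schedule([["OP-1", "OP-2", "OP-3"], ["OP-4"]], cores, total_cores=32)
```

The returned plan reproduces the allocation described in the text: 8 cores each for OP-1, OP-2, and OP-3 concurrently, then all 32 cores for OP-4.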
When the server needs to execute two or more second tasks according to users' requests, the resource configuration method can obtain the second data processing model corresponding to each second task, obtain the scheduling order and parallel execution state of the operators in each second data processing model, and on that basis allocate the server's resources to all operators in the second data processing models. The resource configuration method of the present invention is therefore applicable to complex multi-model scenarios.
As described above, steps S11-S13 implement resource configuration for the operators of a single data processing model; mutual influence between different tasks need not be considered in that process, so it can be regarded as the single-task resource configuration stage. Steps S14-S15 implement resource configuration for the operators of at least two data processing models; mutual influence between different tasks must be considered in that process, so it can be regarded as the multi-task cooperative resource configuration stage.
Compared with the optimization of a single data processing model, multiple data processing models pose greater challenges in data reuse, operator-granularity preemption of shared resources, service execution order, and so on. By contrast, the resource configuration method of this embodiment configures resources at model granularity and can therefore obtain more prior information, which ensures that the method achieves a better service guarantee rate and lower energy consumption.
In addition, since the execution model of GPUs/TPUs differs greatly from that of multi-core chips, some embodiments mainly configure resources for heterogeneous clusters such as CPU+GPU or CPU+TPU, focusing on the overlap of computation and memory access on the GPU/TPU. On a multi-core architecture, however, core allocation, pipelined operator execution, and other factors must be considered in addition to computation-memory-access overlap, so the resource configuration schemes of those embodiments support multi-core architectures poorly. By contrast, the method of this embodiment configures the server's cores based on the scheduling order and parallel execution state of the operators and can thus fully account for core allocation and pipelined operator execution; it is therefore applicable to servers with multi-core architectures.
In an embodiment of the present invention, since the first tasks, the first data processing models, and their operators are all prior information known in advance, for any operator in a first data processing model, the number of resources it uses can be obtained by performance testing. Specifically, referring to Fig. 3, the performance-testing approach for obtaining the number of resources used by any operator in the first data processing model includes:
S131: Configure different numbers of resources for the operator and obtain the operator performance corresponding to each configuration. The operator performance must take into account both the operator's execution time and the number of resources it uses, so as to minimize the average resource occupancy while satisfying the quality of service; for example, the operator performance can be described by the product of the operator's execution time and the number of resources it uses. Specifically, the performance corresponding to each configuration can be obtained by actually executing the operator under that configuration. For example, the operator can be configured with 1 core, 2 cores, ..., 32 cores in turn, and executed under each configuration to obtain the performance with 1 core, with 2 cores, ..., and with 32 cores.
S132: Obtain the number of resources used by the operator according to the performance of each configuration. Preferably, the configuration with the best operator performance is selected from all of the operator's configurations, and its resource count is taken as the number of resources the operator uses. For example, if the operator performs best when configured with 8 cores, then the operator uses 8 cores.
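The per-operator core selection of steps S131-S132 can be sketched as below, using the time-times-cores product suggested in the text as the performance metric. The timing curve is hypothetical, and `profile` stands in for an actual on-chip performance test; none of these names come from the patent.

```python
def choose_cores(profile, candidates):
    """profile(n) -> measured execution time with n cores (a stand-in for an
    actual performance test on the chip). Picks the candidate that minimizes
    execution_time * cores, i.e. the operator's average resource occupancy."""
    best, best_cost = None, float("inf")
    for n in candidates:
        cost = profile(n) * n
        if cost < best_cost:
            best, best_cost = n, cost
    return best

# Hypothetical timing curve: speedup saturates beyond 8 cores, so larger
# allocations waste cores without proportionate gains.
times = {1: 100.0, 2: 51.0, 4: 26.0, 8: 11.0, 16: 9.0, 32: 8.0}
chosen = choose_cores(lambda n: times[n], [1, 2, 4, 8, 16, 32])
```

With this curve the 8-core configuration wins: beyond it, doubling the cores no longer halves the time, so the time-times-cores cost grows, matching the "most economical configuration" rationale in the text.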
In this embodiment, since the resource configuration method is mainly oriented toward multi-core servers and the deep-learning-specific chips they contain, and considering that multiple operators without data dependencies can spatially share the resources of multiple cores, performance testing is used to obtain the number of resources used by each operator in the first data processing model. This approach guarantees the most economical resource allocation for each operator: the operator achieves acceptable performance while its resource occupancy is kept as low as possible, thereby leaving as many resources as possible available for other operators.
In an embodiment of the present invention, after obtaining the number of resources used by each operator in the first data processing model, the resource configuration method further includes: performing graph-level optimization on the operators of the first data processing model according to the number of resources each operator uses, where the graph-level optimization includes operator fusion and/or operator splitting.
Specifically, operator fusion refers to fusing several consecutive operators of the first data processing model that use few resources and belong to the same task, within the limit of the server's maximum available resources; fusion merges at least two operators into one. Operator fusion increases parallelism through pipelined execution and thereby reduces memory access overhead. For example, referring to Fig. 4, the operators Convolution 1, Pooling 1, Normalization 1, and Activation 1 are four consecutive operators of the same task, each using 8 cores; graph-level optimization can fuse them into one operator that computes with 16 cores. Operator fusion thus increases parallelism and reduces memory access overhead.
Operator splitting refers to splitting a long-running operator into two or more operators, within the limit of the server's maximum available resources and without affecting performance. Because split operators have a finer scheduling granularity, they provide greater flexibility in the multi-task cooperative resource configuration stage. For example, referring to Fig. 4, Convolution 2 is a long-running operator using 16 cores; during operator splitting it can be divided, according to the server's core usage, into an operator using 16 cores (Convolution 2a) and an operator using 32 cores (Convolution 2b).
As described above, by performing graph-level optimization on the operators of the first data processing model, this embodiment provides greater flexibility, higher parallelism, and lower memory access overhead for the resource configuration of those operators. Moreover, combining operator fusion with operator splitting effectively reduces the hardware idling and resource waste caused by running times and resource usage during resource configuration, and effectively fills the wasted portions of hardware resources to improve performance.
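The fusion half of the graph-level optimization described above can be sketched as a single greedy pass. Note the merge policy here (summing core budgets up to a limit) is a simplifying assumption of ours; in the patent's Fig. 4 example, four 8-core operators fuse into one 16-core operator, so the real core budget of a fused operator is determined separately.

```python
def fuse_small_ops(ops, core_limit):
    """ops: list of (name, task_id, cores) in execution order. Greedily fuses
    runs of consecutive operators from the same task whose combined core use
    stays within core_limit — a simplified stand-in for the graph-level
    fusion described in the text."""
    fused = []
    for name, task, cores in ops:
        if fused:
            fname, ftask, fcores = fused[-1]
            if ftask == task and fcores + cores <= core_limit:
                # Merge into the previous operator; pipeline them on its cores.
                fused[-1] = (f"{fname}+{name}", task, fcores + cores)
                continue
        fused.append((name, task, cores))
    return fused

ops = [("conv1", "t1", 8), ("pool1", "t1", 8), ("norm1", "t1", 8), ("conv2", "t2", 16)]
result = fuse_small_ops(ops, core_limit=16)
```

Here `conv1` and `pool1` fuse (8 + 8 = 16 cores, within the limit), `norm1` would push the fused group past the limit and stays separate, and `conv2` belongs to a different task so it is never fused across the task boundary.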
Referring to Fig. 5, in an embodiment of the present invention, obtaining the scheduling order of each operator in the second data processing model includes:
S51: Obtain a performance model of each operator in the second data processing model; the performance model contains the operator's execution time and can be obtained by performance testing. Specifically, at compile time, each operator in the second data processing model is run under different configurations, and its performance model is built from parameters such as its execution time. In addition, during actual execution, the performance model can be updated in real time according to the operator's actual behavior to improve its accuracy, thereby obtaining a more optimized resource configuration.
S52: Obtain the quality-of-service requirement of each second task, which can be obtained from the user's task request.
S53: Obtain the scheduling order of each operator in the second data processing model according to the quality-of-service requirements of the second tasks and the performance models of the operators. For example, the execution of operators in tasks with lower quality-of-service requirements can be deferred appropriately, so that the server's resources are given preferentially to operators in tasks with higher quality-of-service requirements. The scheduling order can also be determined by jointly considering the operators' execution times and the quality-of-service requirements of the second tasks.
In an embodiment of the present invention, it is considered that when operators of different tasks execute simultaneously (i.e., when operators of different tasks run in parallel), an operator may be interfered with because operators of other tasks occupy shared resources (such as cache and bandwidth), degrading its performance. When the interference is severe, executing two operators in parallel may take longer than executing them serially, in which case parallel execution performs worse than serial execution. To address this problem, referring to Fig. 6A, after obtaining the parallel execution state of each operator in the second data processing model, the cooperative resource configuration sub-method in this embodiment further includes:
S61: Obtain the interference model between operators in the second data processing model. Referring to Fig. 6B, building the interference model involves quantifying shared-resource demand, performance testing, and building an analytical model. Specifically, quantifying shared-resource demand means quantifying the shared resources, such as cache and bandwidth, demanded by multiple operators. During performance testing, operators can be tested with randomly generated operator parameters and/or operator parameters from common networks (i.e., executing the operators with various parameter sets to obtain the corresponding performance) to measure the interference between operators. When building the analytical model, the interference between operators can be modeled with linear regression models, neural network models, and so on, yielding the interference model between operators in the second data processing model.
S62: Adjust the scheduling order and parallel execution state of each operator in the second data processing model according to the interference model. Specifically, for two operators executing in parallel, if the interference between them is large, step S62 changes them to serial execution and adjusts their scheduling order according to their performance models and/or quality-of-service requirements. For example, if operators 1 and 2 run in parallel and the interference model shows that the interference between them is large, they must be changed to serial execution and their scheduling order must be re-determined.
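The serialize-or-parallelize decision of step S62 can be sketched as below. The interference model is represented only by its output (a predicted co-running makespan), and the shortest-first tie-break for the serial order is our own illustrative choice, not mandated by the patent.

```python
def resolve_pair(t_parallel, t_a, t_b):
    """Decide whether two operators should stay parallel or be serialized.
    t_parallel: predicted makespan when co-running (from the interference
    model); t_a, t_b: solo execution times. If co-running is slower than
    back-to-back serial execution, serialize, running the shorter operator
    first (an assumed tie-break)."""
    if t_parallel > t_a + t_b:
        order = sorted([("A", t_a), ("B", t_b)], key=lambda p: p[1])
        return ("serial", [name for name, _ in order])
    return ("parallel", ["A", "B"])

# Heavy interference: co-running (9.0) is slower than serial (3.0 + 5.0).
decision = resolve_pair(t_parallel=9.0, t_a=3.0, t_b=5.0)
```

This captures the criterion stated in the text: operators whose contention makes parallel execution slower than serial execution are unsuitable for running simultaneously and are re-ordered instead.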
This embodiment can adjust the scheduling order and parallel execution state of the operators in the second data processing model according to the interference model, enabling the resource configuration method to reduce or even eliminate the interference introduced between operators of different tasks by parallel execution, which helps improve the accuracy of resource configuration.
In an embodiment of the present invention, after obtaining the scheduling order and parallel execution state of each operator in the second data processing model, in order to eliminate the hardware resource waste caused in some cases by resource occupancy and scheduling order, referring to Fig. 7A, the cooperative resource configuration sub-method of this embodiment further includes:
S71: Obtain the server's resource usage according to the number of resources used by, the scheduling order of, and the parallel execution state of each operator in the second data processing model. The server's resource usage includes whether the server has idle resources and the quantity of those idle resources. For example, in Fig. 7B, before adjustment, operators OP-1, OP-2, and OP-3 execute in parallel and all three execute serially with operator OP-4; while OP-1, OP-2, and OP-3 are executing, the server has 8 idle cores.
S72: Adjust the number of resources used by one or more operators in the second data processing model according to the server's resource usage. Specifically, if the server has idle resources at some moment, step S72 allocates them to one or more operators executing at that moment. As shown in Fig. 7B, while OP-1, OP-2, and OP-3 are executing, the server has 8 idle cores, which step S72 can allocate to operator OP-1.
This embodiment can adjust the number of resources used by one or more operators in the second data processing model according to the server's resource usage, thereby reducing or even eliminating, as far as possible, the hardware resource waste caused by resource occupancy and scheduling order.
For the server, because users are numerous, service types are numerous, and the arrival times of task requests cannot be determined precisely, it is impossible to statically enumerate all possible situations and give an optimal resource configuration plan in advance; the server must therefore respond to users' task requests in real time at runtime. To achieve this, in an embodiment of the present invention, referring to Fig. 8, when the server receives a user's task request, acquiring the server's current tasks and the tasks corresponding to the task request as the second tasks includes:
S141: Stop the currently executing resource configuration plan, i.e., the plan generated from the task requests the server received previously. Through this step, the resource configuration method discards the existing resource configuration plan when the server receives a new task request.
S142: Obtain the unfinished tasks and subtasks from the server's current tasks. The server's current tasks include the tasks being executed and the tasks not yet started; the unfinished tasks are the tasks the server has not yet started, and the unfinished subtasks are the subtasks, within the tasks being executed, that have not finished or have not started.
S143: Take the tasks corresponding to the user's task request together with the unfinished tasks and subtasks as the second tasks. The resource configuration method then executes the cooperative resource configuration sub-method on these second tasks, producing a new resource configuration plan.
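The re-planning of steps S141-S143 can be sketched as below; the function name and the data layout for tasks and subtasks are assumptions for illustration.

```python
def replan(current_plan, running_tasks, pending_tasks, new_tasks):
    """Sketch of steps S141-S143: discard the active plan, collect unfinished
    work, and form the new second-task set handed to the cooperative
    resource configuration sub-method.
    running_tasks: {task_name: [remaining subtask names]};
    pending_tasks / new_tasks: lists of task names."""
    current_plan.clear()                     # S141: abandon the existing plan
    second_tasks = list(new_tasks)           # tasks from the new request
    second_tasks += list(pending_tasks)      # S142: tasks not yet started
    for task, remaining in running_tasks.items():
        if remaining:                        # keep only unfinished subtasks
            second_tasks.append((task, list(remaining)))
    return second_tasks

second = replan(current_plan={"old": "plan"},
                running_tasks={"task1": ["track"], "task2": []},
                pending_tasks=["task3"],
                new_tasks=["task4"])
```

A completed task (`task2`, with no remaining subtasks) drops out entirely, while the unfinished remainder of `task1` is co-optimized with the pending and newly requested tasks, matching the behaviour described in the text.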
In this embodiment, since the resource configuration method is mainly oriented toward the multi-core server scenario, in which the structure of a task (including its subtasks and logical dependencies) is determined once the service type is determined, this embodiment can respond to users' task requests in real time while the server is running.
In an embodiment of the present invention, after the server's resources have been configured, the resource configuration method further includes: when generating code and compiling it into an executable session, configuring the specific deep-learning-chip programming model to support JIT, and dynamically adjusting the resource configuration, scheduling order, and parallel execution state according to the actual operator execution.
In an embodiment of the present invention, the resource configuration method configures resources in units of the server's cores; in this case the resources configured to each operator are an integer number of cores.
In an embodiment of the present invention, the resource configuration method is applied to a server with a multi-core architecture and configures the server's resources in units of cores. The resource configuration method of this embodiment is oriented toward cloud service scenarios with multiple users and multiple data processing models, in which requests have a larger granularity and contain more prior information. This embodiment can therefore make full use of the prior information, such as the logical structure of tasks, the structure of the models in a task, and the types and parameters of operators, optimizing each model at compile time and modeling the interference between operators, so that when multiple service requests arrive it generates a resource configuration plan superior to optimizing the several data processing models independently. Specifically, referring to Fig. 9, the resource configuration method in this embodiment includes:
S91: Acquire the tasks that the server can execute as first tasks.
S92: Acquire the task information of the first tasks and the files required by the tasks; the task information and the required files are provided at compile time as prior information.
S93: Acquire the first data processing models corresponding to the first tasks and translate them into a unified intermediate representation. Normally the server supports multiple kinds of data processing models, so it is necessary to convert all data processing models into a unified intermediate model.
S94: Perform single-task resource configuration for each first task. This process can be executed before runtime (at compile time). Its goal is to minimize the average resource occupancy while satisfying the quality of service. Specifically, the single-task resource configuration performs graph-level optimization of the model (including operator fusion and operator splitting) with respect to the number of cores used by each operator on the multi-core architecture, and generates an optimized resource configuration for use in the multi-model cooperative resource configuration stage. The performance models of the operators in the first data processing models are also built in this single-task stage, likewise for use in the multi-model cooperative resource configuration stage.
S95: After the single-task resource configuration is complete, use the server to provide services to users.
S96: When the server receives a user's task request, acquire second tasks, which include the server's current tasks and the tasks corresponding to the user's task request; each second task is contained in the first tasks, and the user's task request includes the quality-of-service requirements of the second tasks.
S97: Perform multi-task cooperative resource configuration on the second tasks. Specifically, step S94 can obtain the optimal resource configuration of each data processing model while satisfying the quality-of-service requirements; to guarantee quality of service when multiple models share hardware resources, this step considers the operator scheduling order and the mutual interference between operators, and merges the single-task resource configurations obtained in step S94 into an overall resource configuration, so that as many tasks as possible meet their quality-of-service requirements.
S98: Compile and execute according to the result of the multi-task cooperative resource configuration.
S99: During or after execution, dynamically adjust the operator performance models obtained in step S94 according to feedback such as the execution time and whether the quality-of-service requirements are met, to achieve a more optimized resource configuration.
In addition, for task requests arriving at the server in real time, the resource configuration method of this embodiment proceeds as follows: when a new task request arrives, the resource configuration plan of all existing tasks is discarded, and the remaining parts of the existing tasks are co-optimized together with the new task request to produce a new resource configuration plan.
In the single-task resource configuration stage, because operator types and parameters differ, so do the operators' computational intensity and hardware resource requirements. For operators with little computation, configuring too many resources causes a mismatch between computation and memory access as well as large communication overhead; allocating more resources to such operators therefore improves their performance little and may even degrade it. To address this, in the single-task resource configuration stage this embodiment uses performance testing for each operator to obtain the most economical resource configuration.
Moreover, the graph-level optimization used in step S94 provides greater flexibility, higher parallelism, and lower memory access overhead in the single-task configuration stage. Specifically, after determining each operator's resource configuration, step S94 looks for consecutive operators that use few cores and fuses them to increase parallelism through pipelined execution, and, without affecting performance, splits long-running operators into operators occupying fewer cores wherever possible. Because split operators have a finer scheduling granularity, they provide greater flexibility for the multi-task cooperative resource configuration stage. Graph-level optimization can thus effectively fill the wasted portions of hardware resources, improving the performance of resource configuration.
In addition, in the transition from the single-task stage to the multi-task cooperative stage, operators of different data processing models interfere with one another, and this interference can significantly affect system performance. Different types of operators make different demands on shared resources; for example, memory-intensive operators demand high bandwidth, while compute-intensive operators demand more on-chip cache. To address this, this embodiment uses the quantification/performance-testing/analytical-modeling approach of steps S61-S62 to analyze the interference between operators of different types and parameters and thereby determine the parallel execution mode of the several models, minimizing the impact of interference. For two or more operators, if running them in parallel would take longer than running them serially, their contention for shared resources makes them unsuitable for running simultaneously, and the interference model can fully avoid this situation. Furthermore, while accounting for interference, the multi-task cooperative resource configuration can also fully consider the quality-of-service requirements provided by users and reasonably adjust the execution order of operators in different models, so as to maximize the proportion of tasks whose quality-of-service requirements are met.
Based on the above description of the resource configuration method, the present invention further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the resource configuration method of the present invention.
Based on the above description of the resource configuration method, the present invention further provides a server with a multi-core architecture. Referring to Fig. 10, in an embodiment of the present invention, the server 100 includes: a memory 110 storing a computer program; a processor 120 communicatively connected to the memory 110, which executes the resource configuration method of the present invention when invoking the computer program; and a display 130 communicatively connected to the memory 110 and the processor 120, used to display the GUI of the resource configuration method.
The protection scope of the resource configuration method of the present invention is not limited to the execution order of the steps listed in this embodiment; all solutions implemented by adding, removing, or replacing steps of the prior art according to the principles of the present invention fall within the protection scope of the present invention.
The present invention provides a resource configuration method, medium, and electronic device. The resource configuration method is an optimization method for the compilation and resource configuration of multi-core-architecture deep-learning-specific chips in multi-user, multi-data-processing-model scenarios; it can combine the information of multiple data processing models in real time in various scenarios to co-optimize the resource configuration across models, improving overall service quality, meeting service rates, and minimizing system energy consumption. After multiple users provide task information, service requirements, and the files required by the tasks, the resource configuration method can, in combination with hardware characteristics, successively optimize the single-task and multi-task resource configurations, and finally produce an executable service session to provide services to the users.
Existing research on service scheduling mainly aims to maximize the satisfaction rate of latency-sensitive services and to maximize hardware utilization, designing real-time, low-overhead scheduling algorithms for resource sharing between services and for task execution order. Unlike existing methods, the present invention considers that when providing services for multiple users and multiple tasks, the users' requirements on the tasks should be satisfied as much as possible; this is the most intuitive goal. To facilitate optimization, the present invention converts this goal into a dual goal: minimizing the resources occupied by each task and by all tasks while satisfying the quality-of-service requirements provided by users; these two goals cover the scenarios the present invention targets. When quantifying resource occupancy, the present invention computes, for each task, the average resource occupancy from the resource usage and execution durations of the operators in that task. The final optimization objective used by the present invention is: minimize the sum of the average resource occupancy of all tasks, subject to the quality of service being satisfied and the operator resource occupancy in each task not exceeding the maximum available resources.
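The dual objective described above can be written out explicitly; the notation here is ours, not the patent's. Let task $i$ contain operators $j$ with core allocation $r_{ij}$ and measured execution time $t_{ij}$, let $T_i$ be the task's makespan, and let $R_{\max}$ be the server's total number of cores:

```latex
% Average resource occupancy of task i, and the overall objective
% (notation assumed, not taken from the patent):
\[
  \bar{R}_i = \frac{1}{T_i} \sum_{j \in \mathrm{ops}(i)} r_{ij}\, t_{ij},
  \qquad
  \min \sum_{i=1}^{N} \bar{R}_i
  \quad \text{s.t.} \quad
  \mathrm{QoS}_i \ \text{satisfied}, \;\;
  \sum_{j \in \mathrm{active}(i,\tau)} r_{ij} \le R_{\max} \;\; \forall \tau .
\]
```

The first expression is the per-task average resource occupancy computed from operator resource usage and execution durations, and the minimization with its two constraints restates the objective given in the preceding paragraph.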
When the server needs to execute two or more second tasks according to users' requests, the resource configuration method can obtain the second data processing model corresponding to each second task, obtain the scheduling order and parallel execution state of the operators in each second data processing model, and on that basis allocate the server's resources to all operators in the second data processing models. The resource configuration method of the present invention is therefore applicable to complex multi-model scenarios.
In summary, the present invention effectively overcomes various shortcomings of the prior art and has high industrial utilization value.
The above embodiments only exemplify the principles and effects of the present invention and are not intended to limit it. Anyone familiar with this technology may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes completed by those with ordinary knowledge in the technical field without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.

Claims (10)

  1. A resource configuration method, applied to a server with a multi-core architecture, wherein the resource configuration method comprises:
    acquiring the tasks that the server can execute as first tasks;
    acquiring the data processing models corresponding to the first tasks as first data processing models, wherein each of the first data processing models contains at least one operator;
    configuring resources for each operator in the first data processing models to obtain the number of resources used by each operator in the first data processing models;
    when the server receives a user's task request, acquiring second tasks, wherein the second tasks include the current tasks of the server and the tasks corresponding to the user's task request;
    when the number of the second tasks is greater than 1, executing a cooperative resource configuration sub-method, wherein the cooperative resource configuration sub-method comprises:
    acquiring the data processing models corresponding to the second tasks as second data processing models;
    obtaining the number of resources used by each operator in the second data processing models according to the number of resources used by each operator in the first data processing models;
    obtaining the scheduling order and parallel execution state of each operator in the second data processing models;
    configuring the resources of the server according to the number of resources used by, the scheduling order of, and the parallel execution state of each operator in the second data processing models.
  2. The resource configuration method according to claim 1, wherein for any operator in the first data processing model, the method of obtaining the number of resources used by that operator comprises:
    configuring different numbers of resources for the operator and obtaining the operator performance corresponding to each configuration;
    obtaining the number of resources used by the operator according to the operator performance corresponding to each configuration.
  3. The resource configuration method according to claim 1, wherein the resource configuration method further comprises:
    performing operator fusion and/or operator splitting on the operators in the first data processing model according to the number of resources used by each operator in the first data processing model.
  4. The resource configuration method according to claim 1, wherein obtaining the scheduling order of each operator in the second data processing model comprises:
    obtaining a performance model of each operator in the second data processing model, wherein the performance model contains the execution time of the operator;
    obtaining the quality-of-service requirement of each of the second tasks;
    obtaining the scheduling order of each operator in the second data processing model according to the quality-of-service requirements of the second tasks and the performance models of the operators in the second data processing model.
  5. The resource configuration method according to claim 1, wherein after obtaining the parallel execution state of each operator in the second data processing model, the cooperative resource configuration sub-method further comprises:
    obtaining an interference model between operators in the second data processing model;
    adjusting the scheduling order and parallel execution state of each operator in the second data processing model according to the interference model.
  6. The resource configuration method according to claim 1, wherein after obtaining the scheduling order and parallel execution state of each operator in the second data processing model, the cooperative resource configuration sub-method further comprises:
    obtaining the resource usage of the server according to the number of resources used by, the scheduling order of, and the parallel execution state of each operator in the second data processing model;
    adjusting the number of resources used by at least one operator in the second data processing model according to the resource usage of the server.
  7. The resource configuration method according to claim 1, wherein when the server receives a user's task request, acquiring the second tasks comprises:
    stopping the currently executing resource configuration plan;
    obtaining the unfinished tasks and subtasks from the current tasks of the server;
    taking the tasks corresponding to the user's task request together with the unfinished tasks and subtasks as the second tasks.
  8. The resource configuration method according to claim 1, wherein the resource configuration method configures resources in units of the cores of the server.
  9. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the resource configuration method according to any one of claims 1-8.
  10. A server, wherein the server has a multi-core architecture and comprises:
    a memory storing a computer program;
    a processor communicatively connected to the memory, which executes the resource configuration method according to any one of claims 1-8 when invoking the computer program;
    a display communicatively connected to the processor and the memory, used to display the GUI of the resource configuration method.
PCT/CN2021/095765 2020-10-21 2021-05-25 Resource configuration method, medium, and server WO2022083119A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011134276.0A CN112199196B (zh) 2020-10-21 2020-10-21 Resource configuration method, medium, and server
CN202011134276.0 2020-10-21

Publications (1)

Publication Number Publication Date
WO2022083119A1 true WO2022083119A1 (zh) 2022-04-28

Family

ID=74011298

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/095765 WO2022083119A1 (zh) 2020-10-21 2021-05-25 一种资源配置方法、介质及服务端

Country Status (2)

Country Link
CN (1) CN112199196B (zh)
WO (1) WO2022083119A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112199196B (zh) * 2020-10-21 2022-03-18 上海交通大学 一种资源配置方法、介质及服务端
CN114490116B (zh) * 2021-12-27 2023-03-24 北京百度网讯科技有限公司 数据处理方法、装置、电子设备及存储介质
CN116225669B (zh) * 2023-05-08 2024-01-09 之江实验室 一种任务执行方法、装置、存储介质及电子设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105843679A (zh) * 2016-03-18 2016-08-10 西北工业大学 自适应众核资源调度方法
CN109684088A (zh) * 2018-12-17 2019-04-26 南京理工大学 云平台资源约束的遥感大数据快速处理任务调度方法
CN109947619A (zh) * 2019-03-05 2019-06-28 上海交通大学 基于服务质量感知提高吞吐量的多资源管理系统及服务器
US20190391853A1 (en) * 2018-06-25 2019-12-26 International Business Machines Corporation Multi-tier coordination of destructive actions
CN111736987A (zh) * 2020-05-29 2020-10-02 山东大学 一种基于gpu空间资源共享的任务调度方法
CN112199196A (zh) * 2020-10-21 2021-01-08 上海交通大学 一种资源配置方法、介质及服务端

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9934071B2 (en) * 2015-12-30 2018-04-03 Palo Alto Research Center Incorporated Job scheduler for distributed systems using pervasive state estimation with modeling of capabilities of compute nodes
CN108491253A (zh) * 2018-01-30 2018-09-04 济南浪潮高新科技投资发展有限公司 一种计算任务处理方法以及边缘计算设备
CN111694653A (zh) * 2019-03-13 2020-09-22 阿里巴巴集团控股有限公司 计算系统中调整计算算子类型分布的方法、装置及系统
CN110750342B (zh) * 2019-05-23 2020-10-09 北京嘀嘀无限科技发展有限公司 调度方法、装置、电子设备及可读存储介质
CN110704182A (zh) * 2019-09-18 2020-01-17 平安科技(深圳)有限公司 深度学习的资源调度方法、装置及终端设备
CN110825511A (zh) * 2019-11-07 2020-02-21 北京集奥聚合科技有限公司 一种基于建模平台模型运行流程调度方法
CN111274019B (zh) * 2019-12-31 2023-05-12 深圳云天励飞技术有限公司 一种数据处理方法、装置及计算机可读存储介质
CN111367643A (zh) * 2020-03-09 2020-07-03 北京易华录信息技术股份有限公司 一种算法调度系统、方法及装置


Also Published As

Publication number Publication date
CN112199196B (zh) 2022-03-18
CN112199196A (zh) 2021-01-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21881543

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21881543

Country of ref document: EP

Kind code of ref document: A1
