WO2023206097A1 - AI model deployment method and apparatus, and electronic device and computer-readable medium - Google Patents


Info

Publication number
WO2023206097A1
WO2023206097A1 (PCT/CN2022/089379)
Authority
WO
WIPO (PCT)
Prior art keywords
graph
edge device
hardware configuration
node
nodes
Prior art date
Application number
PCT/CN2022/089379
Other languages
French (fr)
Chinese (zh)
Inventor
王海峰
张洪洋
Original Assignee
西门子股份公司
西门子(中国)有限公司
Priority date
Filing date
Publication date
Application filed by 西门子股份公司 and 西门子(中国)有限公司
Priority to PCT/CN2022/089379
Publication of WO2023206097A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning



Abstract

The embodiments of the present application relate mainly to the field of artificial intelligence (AI), and in particular to an AI model deployment method and apparatus, an electronic device, and a computer-readable medium. The method comprises: parsing an AI model file into a graph composed of nodes; performing a first grouping of the nodes in the graph so that each of the multiple subgraphs obtained by the grouping satisfies a hardware configuration condition of an edge device; running a simulation of the graph containing the first grouping information; and, when the hardware configuration condition required by the simulation run is less than or equal to the hardware configuration condition of the edge device, deploying the graph containing the first grouping information to the edge device, the edge device executing each of the multiple subgraphs in sequence.

Description

Deployment method, apparatus, electronic device and computer-readable medium for an AI model
Technical Field
The embodiments of the present application relate mainly to the field of artificial intelligence, and in particular to an AI model deployment method and apparatus, an electronic device, and a computer-readable medium.
Background
In recent years, the field of artificial intelligence has developed rapidly and AI models have emerged in great numbers. In the industrial field, however, large AI models often cannot run well on edge devices because of the devices' limited memory and computing power, which has long been a difficult problem in industrial applications. One current approach reduces the complexity of a large AI model through distillation or pruning, which requires modifying the model or performing additional training; this, however, may impair the model's accuracy and precision.
Summary of the Invention
Embodiments of the present application provide an AI model deployment method, apparatus, platform, and computer-readable medium that deploy a large AI model on an edge device without affecting the model's accuracy and precision.
In a first aspect, an AI model deployment method is provided, comprising: parsing an AI model file into a graph, wherein the graph is composed of nodes; performing a first grouping of the nodes in the graph so that each of the multiple subgraphs obtained by the grouping satisfies a hardware configuration condition of an edge device; running a simulation of the graph containing the first grouping information; and, when the hardware configuration condition required by the simulation run is less than or equal to the hardware configuration condition of the edge device, deploying the graph containing the first grouping information to the edge device, the edge device executing each of the multiple subgraphs in sequence.
In a second aspect, an AI model deployment apparatus is provided, comprising components for executing each step of the method provided in the first aspect.
In a third aspect, an electronic device is provided, comprising: at least one memory configured to store computer-readable code; and at least one processor configured to invoke the computer-readable code to execute each step of the method provided in the first aspect.
In a fourth aspect, a computer-readable medium is provided, on which computer-readable instructions are stored; when executed by a processor, the computer-readable instructions cause the processor to execute each step of the method provided in the first aspect.
Brief Description of the Drawings
The following drawings are intended only to illustrate and explain the embodiments of the present application schematically, and do not limit the scope of the embodiments. In the drawings:
Figure 1 is a flowchart of an AI model deployment method according to an embodiment of the present application;
Figure 2 is a schematic diagram of a method of grouping nodes in a graph according to an embodiment of the present application;
Figure 3 is a schematic diagram of an AI model deployment apparatus according to an embodiment of the present application;
Figure 4 is a schematic diagram of an electronic device according to an embodiment of the present application.
Reference Signs
100: AI model deployment method    101-104: method steps
30: AI model deployment apparatus    31: sending module    32: edge device emulator
33: graph splitter    34: graph parser    35: adjuster
400: electronic device    401: memory    402: processor
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and implement the subject matter described herein, and not to limit the scope of protection, applicability, or examples set forth in the claims. The functions and arrangement of the elements discussed may be changed without departing from the scope of the embodiments of the present application. Each example may omit, substitute, or add various procedures or components as needed. For example, the described methods may be performed in an order different from that described, and individual steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "includes" and its variants denote open terms meaning "including but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first", "second", and so on may refer to different or to the same objects. Other definitions, whether explicit or implicit, may be included below. Unless the context clearly indicates otherwise, the definition of a term is consistent throughout this specification.
The embodiments of the present application are described in detail below with reference to the accompanying drawings.
Figure 1 is a flowchart of an AI model deployment method according to an embodiment of the present application. As shown in Figure 1, the AI model deployment method 100 includes:
Step 101: parse an AI model file into a graph, where the graph is composed of nodes.
Step 102: perform a first grouping of the nodes in the graph so that each of the multiple subgraphs obtained by the grouping satisfies a hardware configuration condition of the edge device.
Optionally, the hardware configuration condition of the edge device may be, for example, random access memory (RAM) space or hard disk space. Assuming that the hardware configuration condition selected for comparison is RAM space, the nodes in the graph are grouped for the first time so that the RAM space occupied by each of the resulting subgraphs is equal or close to the RAM space of the edge device.
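As a toy illustration of this RAM-driven grouping, the greedy packer below puts consecutive nodes into one subgraph until the next node would push the subgraph past the device's RAM budget, so each subgraph's footprint stays at or just under that budget. The per-node RAM figures and the purely sequential packing order are assumptions for illustration only; the embodiment described later traverses the graph by connectivity rather than in list order.

```python
def group_by_ram(nodes, edge_ram):
    """nodes: list of (name, ram_bytes) pairs in execution order.
    Returns subgraphs whose summed RAM stays at or under edge_ram."""
    subgraphs, current, used = [], [], 0
    for name, ram in nodes:
        if current and used + ram > edge_ram:  # next node would overflow
            subgraphs.append(current)          # close the current subgraph
            current, used = [], 0
        current.append((name, ram))
        used += ram
    if current:
        subgraphs.append(current)
    return subgraphs
```

For example, with a 100-unit budget, hypothetical nodes of 60, 50, and 30 units split into two subgraphs: one holding the 60-unit node, and one holding the 50- and 30-unit nodes.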
Step 103: run a simulation of the graph containing the first grouping information.
The RAM space occupied when a graph containing grouping information is actually run is difficult to estimate accurately: in addition to the AI model itself, represented by the graph with grouping information, the occupied RAM includes the cache of the associated optimizer and the storage of associated intermediate variables. Running a simulation therefore makes it possible to predict the result of the actual run in advance.
Step 104: when the hardware configuration condition required by the simulation run is less than or equal to the hardware configuration condition of the edge device, deploy the graph containing the first grouping information to the edge device, which then executes each of the multiple subgraphs in sequence.
In the embodiments of the present application, the graph corresponding to the AI model is grouped so that the hardware resources occupied by each resulting subgraph satisfy the relevant hardware configuration condition of the edge device; the grouping is then tested by a simulation run, and the graph with grouping information that meets the requirement is deployed to the edge device. The embodiments of the present application thus make it possible to deploy a large AI model on an edge device without affecting the model's accuracy and precision, greatly broadening the application scenarios of edge devices.
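Steps 101 to 104 can be read as a linear pipeline. The sketch below is a minimal illustration of that control flow, not the patent's implementation; the `parse`, `group`, `simulate`, and `deploy` callables are hypothetical stand-ins for the components described here.

```python
def deploy_ai_model(model_file, edge_ram, *, parse, group, simulate, deploy):
    """Method 100: parse, group, simulate, and deploy if the budget holds."""
    graph = parse(model_file)           # step 101: model file -> node graph
    subgraphs = group(graph, edge_ram)  # step 102: subgraphs sized to the device
    needed = simulate(subgraphs)        # step 103: simulated peak RAM demand
    if needed <= edge_ram:              # step 104: deploy only if it fits
        deploy(subgraphs)
        return "deployed"
    return "needs regrouping"
```

A "needs regrouping" result corresponds to the case the embodiments handle by regrouping the graph under a reduced hardware budget.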
Optionally, when the hardware configuration condition required by the simulation run is greater than the hardware configuration condition of the edge device, the following is executed in a loop: reduce the hardware configuration condition corresponding to the graph containing the previous grouping information, regroup the nodes in the graph according to the reduced condition, and simulate the graph containing the current grouping information; the loop repeats until the hardware configuration condition required by the running graph containing the current grouping information is less than or equal to the hardware configuration condition of the edge device, at which point the graph containing the current grouping information is deployed to the edge device.
In one embodiment, when the RAM space required by the simulation run is greater than the RAM space of the edge device, the following is executed in a loop: the RAM space corresponding to the graph containing the previous grouping information is reduced by a preset ratio, for example to 90% of that RAM space; the nodes in the graph are regrouped according to the reduced RAM space; and the graph containing the current grouping information is simulated. The loop repeats until the RAM space required by the running graph containing the current grouping information is less than or equal to the RAM space of the edge device, at which point the graph containing the current grouping information is deployed to the edge device.
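The 90% back-off can be written as a short loop around hypothetical `group_nodes` and `simulate_ram` helpers. This is an illustrative sketch of the described behaviour, not the patent's code; the 0.9 factor is the example ratio from the text, and `max_rounds` is an added safety bound.

```python
def deploy_with_backoff(graph, edge_ram, group_nodes, simulate_ram,
                        ratio=0.9, max_rounds=20):
    """Shrink the per-subgraph RAM budget until the simulated run fits."""
    budget = edge_ram
    for _ in range(max_rounds):
        subgraphs = group_nodes(graph, budget)   # regroup under current budget
        if simulate_ram(subgraphs) <= edge_ram:  # simulated run fits the device
            return subgraphs                     # this grouping can be deployed
        budget *= ratio                          # cut the budget to 90%, retry
    raise RuntimeError("no grouping fit the edge device within max_rounds")
```

Because the simulated run also accounts for optimizer caches and intermediate variables, a grouping that exactly fills the device's RAM can fail simulation; lowering the grouping budget below the physical RAM is what makes room for that overhead.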
Optionally, when the hardware configuration condition required by the simulation run is greater than that of the edge device, the hardware configuration condition used to run the graph containing the first grouping information is reduced by a preset degree to obtain the hardware configuration condition after a first reduction. Using this first reduced condition, the nodes in the graph are grouped a second time, and the graph containing the second grouping information is run. When the hardware configuration condition required by this run is less than or equal to that of the edge device, the graph containing the second grouping information is deployed to the edge device; when it is greater, the process continues by analogy until the hardware configuration condition required by the running graph of the current grouping information is less than or equal to that of the edge device, at which point the graph of the current grouping information is deployed to the edge device.
The embodiments of the present application provide an end-to-end, closed-loop edge deployment method for large AI models: whenever the result of the simulation run does not meet the requirement, the graph is regrouped to satisfy the hardware configuration condition set for the current round, until the simulation result meets the requirement and the actual deployment is completed. The embodiments require no modification of the AI model and no additional training, which not only improves deployment efficiency but also saves human resources.
In one embodiment, a method for grouping the nodes in a graph is provided: starting from at least one input node of the graph, the nodes in the graph are traversed according to preset rules and thereby grouped. The preset rules include a graph-connectivity rule and a rule that gives priority of access to nodes all of whose parent nodes have been visited and to nodes that have no parent node. The graph-connectivity rule means that the way the nodes are traversed is determined by whether any two nodes are connected by an edge. Optionally, when the rule giving priority to nodes all of whose parents have been visited conflicts with the rule giving priority to nodes without parents, the former takes precedence.
In one embodiment, Figure 2 is a schematic diagram of a method of grouping nodes in a graph according to an embodiment of the present application. As shown in Figure 2, at least one input node of the graph, for example node 7 and node 1, is added in turn to a first preset array, where the first preset array stores all nodes waiting to be visited and follows the last-in, first-out (LIFO) principle. An input node is a node of the graph that has no parent node. The LIFO principle means that a node added to the array later is taken out and visited before a node added earlier. A first input node, for example node 1, is selected from the first preset array as the first node to be visited, the first input node being the last node added to the first preset array. According to node connectivity, visiting starts from the first node to be visited, and the visited nodes are determined to be nodes of the first subgraph. Then the child nodes directly connected to the first node to be visited are added to the first preset array; for example, the child nodes directly connected to node 1, namely node 2, node 3, and node 8, are added to the first preset array.
When all parent nodes of a second node selected from the first preset array have been visited, or the second node has no parent node, for example node 2 or node 3 in Figure 2, the second node is determined to be a node of the first subgraph, and all child nodes of the second node are added to the first preset array.
When a third node selected from the first preset array has a parent node that has not yet been visited, for example node 8 in Figure 2, the third node is added to a second preset array. When all parent nodes of the third node have been visited, the third node is added to the first preset array. Optionally, in each round in which a new node is selected as the node to be visited, the nodes in the second preset array are checked for whether they satisfy the condition for being added to the first preset array, that is, whether all of their parent nodes have been visited; if the condition is satisfied, the node is removed from the second preset array and added to the first preset array.
Optionally, the step of adding the child nodes directly connected to node 1 to the first preset array takes precedence over the step of taking node 7 out of the first preset array; that is, when only one node remains to be taken out of the first preset array, the relevant nodes waiting to be added are first added to the first preset array.
By grouping the nodes in the graph in this way, multiple subgraphs are determined in the graph. The grouping method provided by the embodiments of the present application involves no additional modification of the AI model and therefore does not affect the model's accuracy and precision.
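The traversal just described, with a LIFO "first preset array" of visit-ready nodes and a "second preset array" holding nodes whose parents are not yet all visited, can be sketched as follows. This is one reading of the stated rules rather than the patent's code, and the example graph in the usage note only loosely mirrors Figure 2.

```python
def traverse_nodes(graph, inputs):
    """graph: {node: [child, ...]}; inputs: parentless nodes, added in order.
    Returns the visit order produced by the two-array rules."""
    parents = {n: set() for n in graph}
    for parent, children in graph.items():
        for child in children:
            parents[child].add(parent)
    first, second = list(inputs), []   # first array is LIFO; second holds waiters
    visited, order = set(), []
    while first or second:
        # promote any held node whose parents have now all been visited
        for node in [n for n in second if parents[n] <= visited]:
            second.remove(node)
            first.append(node)
        if not first:
            break                      # remaining nodes unreachable from inputs
        node = first.pop()             # LIFO: last added is visited first
        if node in visited:
            continue
        if parents[node] <= visited:   # all parents visited (or none exist)
            visited.add(node)
            order.append(node)
            first.extend(graph[node])  # push the directly connected children
        else:
            second.append(node)        # hold until its parents are visited
    return order
```

For a graph in which node 1 feeds nodes 2, 3, and 8 and node 7 also feeds node 8, this traversal visits node 1 first and defers node 8 until node 7 has been visited.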
Figure 3 is a schematic diagram of an AI model deployment apparatus 30 according to an embodiment of the present application. As shown in Figure 3, the AI model deployment apparatus 30 includes:
a sending module 31, configured to send the hardware configuration condition of the edge device to an edge device emulator 32 and a graph splitter 33 respectively;
a graph parser 34, configured to parse an AI model file into a graph, where the graph is composed of nodes;
the graph splitter 33, configured to perform a first grouping of the nodes in the graph so that each of the multiple subgraphs obtained by the grouping satisfies the hardware configuration condition of the edge device; and
the edge device emulator 32, configured to run the graph containing the first grouping information and, when the hardware configuration condition required by the run is less than or equal to the hardware configuration condition of the edge device, to deploy the graph containing the first grouping information to the edge device so that the edge device executes each of the multiple subgraphs in sequence.
Optionally, when the hardware configuration condition required by the run of the edge device emulator 32 is greater than the hardware configuration condition of the edge device, the following loop is entered: the edge device emulator 32 sends a simulation result indicating failure, together with the hardware configuration condition corresponding to the graph containing the previous grouping information, to an adjuster 35; the adjuster 35 reduces that hardware configuration condition; the graph splitter 33 regroups the nodes in the graph according to the reduced condition; and the edge device emulator 32 runs the graph containing the current grouping information. The loop repeats until the hardware configuration condition required by the running graph containing the current grouping information is less than or equal to that of the edge device, at which point the graph containing the current grouping information is deployed to the edge device so that the edge device executes each of the multiple subgraphs in sequence.
Optionally, the sending module 31 is further configured to collect the hardware configuration condition of the edge device.
Under these embodiments, a large AI model can be deployed on an edge device without affecting the model's accuracy and precision, greatly broadening the application scenarios of edge devices.
An embodiment of the present application further provides an electronic device 400. Figure 4 is a schematic diagram of the electronic device 400 according to an embodiment of the present application. As shown in Figure 4, the electronic device 400 includes a processor 402 and a memory 401; instructions are stored in the memory 401, and when the instructions are executed by the processor 402 the method 100 described above is implemented.
The at least one processor 402 may include a microprocessor, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a central processing unit (CPU), a graphics processing unit (GPU), a state machine, and so on. Examples of computer-readable media include, but are not limited to, floppy disks, CD-ROMs, magnetic disks, memory chips, ROM, RAM, ASICs, configured processors, all optical media, all magnetic tapes or other magnetic media, or any other medium from which a computer processor can read instructions. In addition, various other forms of computer-readable media can send or carry instructions to a computer, including routers, private or public networks, or other wired and wireless transmission devices or channels. The instructions may include code in any computer programming language, including C, C++, Visual Basic, Java, and JavaScript.
In addition, an embodiment of the present application also provides a computer-readable medium on which computer-readable instructions are stored; when executed by a processor, the computer-readable instructions cause the processor to execute the aforementioned AI model deployment method. Examples of computer-readable media include floppy disks, hard disks, magneto-optical disks, optical disks (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, and DVD+RW), magnetic tape, non-volatile memory cards, and ROM. Optionally, the computer-readable instructions may be downloaded over a communication network from a server computer or from the cloud.
It should be noted that not all of the steps and modules in the above flows and system structure diagrams are necessary; some steps or modules may be omitted according to actual needs. The order in which the steps are executed is not fixed and may be adjusted as needed. The system structures described in the above embodiments may be physical structures or logical structures; that is, some modules may be implemented by the same physical entity, some modules may be implemented separately by multiple physical entities, or some modules may be implemented jointly by certain components in multiple independent devices.

Claims (11)

  1. A method for deploying an AI model, comprising:
    - parsing (101) an AI model file into a graph, wherein the graph is composed of nodes;
    - performing a first grouping (102) of the nodes in the graph, so that each of the multiple subgraphs obtained by the grouping satisfies a hardware configuration condition of an edge device;
    - simulating a run (103) of the graph containing the first grouping information;
    - when the hardware configuration condition required by the simulated run is less than or equal to the hardware configuration condition of the edge device, deploying (104) the graph containing the first grouping information to the edge device, the edge device executing each of the multiple subgraphs in sequence.
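The flow of claim 1 (parse → group → simulate → deploy) can be illustrated with a short sketch. Everything here is a hypothetical simplification: the function and parameter names (`deploy_model`, `node_ram`, `device_ram`), the greedy sequential grouping, and the "simulation" modeled as the largest subgraph's RAM footprint are assumptions for illustration, not the patent's actual implementation.

```python
def deploy_model(model_nodes, node_ram, device_ram):
    """Group graph nodes into subgraphs that each fit the edge device's RAM,
    then check whether the simulated peak requirement fits the device."""
    subgraphs, current, used = [], [], 0
    for node in model_nodes:                 # step 102: first grouping
        if used + node_ram[node] > device_ram and current:
            subgraphs.append(current)        # close the subgraph at the budget
            current, used = [], 0
        current.append(node)
        used += node_ram[node]
    if current:
        subgraphs.append(current)
    # step 103: simulate — peak RAM is the largest subgraph's footprint
    peak = max(sum(node_ram[n] for n in sg) for sg in subgraphs)
    deployable = peak <= device_ram          # step 104: deploy only if it fits
    return subgraphs, peak, deployable
```

With nodes needing 3, 4, and 2 units of RAM on a 5-unit device, the grouping yields three single-node subgraphs and a peak requirement of 4, so deployment proceeds.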
  2. The method according to claim 1, wherein, after the run (104) of the graph containing the grouping information, the method further comprises:
    - when the hardware configuration condition required by the simulated run is greater than the hardware configuration condition of the edge device, cyclically performing the following: lowering the hardware configuration condition corresponding to the graph containing the previous grouping information, performing a next grouping of the nodes in the graph according to the lowered hardware configuration condition, and simulating a run of the graph containing the current grouping information, until the hardware configuration condition required by the run of the graph containing the current grouping information is less than or equal to the hardware configuration condition of the edge device, and then deploying the graph containing the current grouping information to the edge device.
  3. The method according to claim 1, wherein the hardware configuration condition of the edge device includes RAM space or hard disk space.
  4. The method according to claim 1, wherein making each of the multiple subgraphs obtained by the grouping satisfy the hardware configuration condition of the edge device includes:
    - making the RAM space occupied by each of the multiple subgraphs obtained by the grouping equal or close to the RAM space of the edge device.
  5. The method according to claim 2, wherein, after the run (104) of the graph containing the grouping information, the method further comprises:
    - when the RAM space required by the simulated run is greater than the RAM space of the edge device, cyclically performing the following: lowering the RAM space corresponding to the graph containing the previous grouping information by a preset ratio, performing a next grouping of the nodes in the graph according to the lowered RAM space, and simulating a run of the graph containing the current grouping information, until the RAM space required by the run of the graph containing the current grouping information is less than or equal to the RAM space of the edge device, and then deploying the graph containing the current grouping information to the edge device.
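The retry loop of claims 2 and 5 can be sketched as follows. This is a minimal, hypothetical model: `group` is a greedy sequential grouping under a RAM budget, `simulate` stands in for the simulated run (modeled here as the largest subgraph's footprint plus a fixed runtime overhead), and the default shrink `ratio` of 0.8 is an arbitrary illustrative value, not a figure from the patent.

```python
def group(nodes, ram, budget):
    """Greedy sequential grouping under a RAM budget (simplified)."""
    sgs, cur, used = [], [], 0
    for n in nodes:
        if cur and used + ram[n] > budget:
            sgs, cur, used = sgs + [cur], [], 0
        cur.append(n)
        used += ram[n]
    return sgs + [cur] if cur else sgs

def simulate(sgs, ram, overhead=1.0):
    """Stand-in for the simulated run: largest subgraph plus runtime overhead."""
    return max(sum(ram[n] for n in sg) for sg in sgs) + overhead

def iterative_deploy(nodes, ram, device_ram, ratio=0.8, max_rounds=20):
    """Claims 2/5: if the simulated run exceeds the device's RAM, shrink the
    grouping budget by a preset ratio and regroup, until the graph fits."""
    budget = device_ram
    for _ in range(max_rounds):
        sgs = group(nodes, ram, budget)
        if simulate(sgs, ram) <= device_ram:
            return sgs                 # fits — deploy this grouping
        budget *= ratio                # lower the configuration condition
    raise RuntimeError("could not fit model on edge device")
```

For four 2-unit nodes on a 4-unit device, the first grouping (two subgraphs of 4 units each) fails the simulated check once overhead is added, so the budget is reduced and the second round produces four single-node subgraphs that fit.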
  6. The method according to claim 1 or 2, wherein grouping the nodes in the graph includes:
    - starting from at least one input node in the graph, traversing the nodes in the graph according to preset rules and grouping the nodes in the graph; wherein the preset rules include: a graph-connectivity rule, and a rule that related nodes all of whose parent nodes have been visited, and related nodes having no parent node, are visited first.
  7. The method according to claim 1 or 2, wherein grouping the nodes in the graph includes:
    - adding at least one input node of the graph to a first preset array, wherein the first preset array is used to store all nodes to be visited and follows the last-in-first-out principle;
    - selecting, from the first preset array, a first input node as a first node to be visited, wherein the first input node is the last node added to the first preset array;
    - according to node connectivity, starting the visit from the first node to be visited, a visited node being determined as a node of a first subgraph;
    - adding the child nodes directly connected to the first node to be visited to the first preset array;
    - when all parent nodes of a second node selected from the first preset array have been visited, or the second node has no parent node, determining the second node as a node of the first subgraph, and adding all child nodes of the second node to the first preset array;
    - when a third node selected from the first preset array has an unvisited parent node, adding the third node to a second preset array; when all parent nodes of the third node have been visited, adding the third node to the first preset array;
    - grouping the nodes in the graph, thereby determining multiple subgraphs in the graph.
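The traversal of claim 7 can be sketched as below: a LIFO stack (the "first preset array") holds nodes awaiting a visit, and a node joins the subgraph only once all of its parents have been visited; otherwise it is parked in a second array until they have. The function and variable names are illustrative assumptions, and the sketch returns a single visit order — the cutting of that order into RAM-bounded subgraphs (claims 1 and 4) is omitted for brevity.

```python
def traverse_group(inputs, children, parents):
    """Claim-7-style traversal: LIFO stack of nodes to visit; defer any node
    that still has an unvisited parent to a second array."""
    stack = list(inputs)      # first preset array (last in, first out)
    deferred = []             # second preset array
    visited, order = set(), []
    while stack:
        node = stack.pop()    # the last node added is visited first
        if node in visited:
            continue
        if all(p in visited for p in parents.get(node, [])):
            visited.add(node)
            order.append(node)
            stack.extend(children.get(node, []))
            # re-queue parked nodes whose parents are now all visited
            ready = [n for n in deferred
                     if all(p in visited for p in parents.get(n, []))]
            for n in ready:
                deferred.remove(n)
                stack.append(n)
        else:
            deferred.append(node)
    return order
```

On a diamond graph a→{b, c}→d, node d is popped before b has been visited, so it is deferred and only rejoins the stack after both of its parents are done — yielding the order a, c, b, d.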
  8. An apparatus for deploying an AI model, comprising:
    - a sending module (31), configured to send the hardware configuration condition of an edge device to an edge device emulator (32) and a graph splitter (33), respectively;
    - a graph parser (34), configured to parse an AI model file into a graph, wherein the graph is composed of nodes;
    - the graph splitter (33), configured to perform a first grouping of the nodes in the graph, so that each of the multiple subgraphs obtained by the grouping satisfies the hardware configuration condition of the edge device;
    - the edge device emulator (32), configured to run the graph containing the first grouping information and, when the hardware configuration condition required by the run is less than or equal to the hardware configuration condition of the edge device, deploy the graph containing the first grouping information to the edge device, so that the edge device executes each of the multiple subgraphs in sequence.
  9. The apparatus according to claim 8, wherein:
    - when the hardware configuration condition required for the run by the edge device emulator (32) is greater than the hardware configuration condition of the edge device, a loop is entered:
    - an adjuster (35) lowers the hardware configuration condition corresponding to the graph containing the previous grouping information, the graph splitter (33) performs a next grouping of the nodes in the graph according to the lowered hardware configuration condition, and the edge device emulator (32) runs the graph containing the current grouping information,
    - until the hardware configuration condition required by the run of the graph containing the current grouping information is less than or equal to the hardware configuration condition of the edge device, whereupon the graph containing the current grouping information is deployed to the edge device.
  10. An electronic device, comprising:
    at least one memory (401), configured to store computer-readable code;
    at least one processor (402), configured to invoke the computer-readable code to perform the steps of the method according to any one of claims 1 to 7.
  11. A computer-readable medium having computer-readable instructions stored thereon, the computer-readable instructions, when executed by a processor, causing the processor to perform the steps of the method according to any one of claims 1 to 7.
PCT/CN2022/089379 2022-04-26 2022-04-26 Ai model deployment method and apparatus, and electronic device and computer-readable medium WO2023206097A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/089379 WO2023206097A1 (en) 2022-04-26 2022-04-26 Ai model deployment method and apparatus, and electronic device and computer-readable medium


Publications (1)

Publication Number Publication Date
WO2023206097A1 true WO2023206097A1 (en) 2023-11-02

Family

ID=88516541

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/089379 WO2023206097A1 (en) 2022-04-26 2022-04-26 Ai model deployment method and apparatus, and electronic device and computer-readable medium

Country Status (1)

Country Link
WO (1) WO2023206097A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109614238A (en) * 2018-12-11 2019-04-12 深圳市网心科技有限公司 A kind of recongnition of objects method, apparatus, system and readable storage medium storing program for executing
CN111240209A (en) * 2020-03-16 2020-06-05 广东工业大学 Adaptive configuration method and system for configuration dynamic control type optimal linkage response
US20210382754A1 (en) * 2021-06-12 2021-12-09 Intel Corporation Serverless computing architecture for artificial intelligence workloads on edge for dynamic reconfiguration of workloads and enhanced resource utilization
US20210390460A1 (en) * 2021-06-12 2021-12-16 Intel Corporation Compute and memory based artificial intelligence model partitioning using intermediate representation
CN113849314A (en) * 2021-09-30 2021-12-28 支付宝(杭州)信息技术有限公司 Data processing model deployment method and device


Similar Documents

Publication Publication Date Title
EP4369180A2 (en) Callpath finder
EP3032425A1 (en) Integrated automated test case generation for safety-critical software
US20210365253A1 (en) Heterogeneity-agnostic and topology-agnostic data plane programming
CN111158656B (en) Test code generation method and device based on fruit tree method
WO2014093719A2 (en) Method, apparatus, and computer-readable medium for optimized data subsetting
EP1548581A2 (en) Methods, apparatus and programs for system development
CN110109816A (en) Test cases selection method and apparatus
CN111767217B (en) JS unit test case generation method and device
WO2023040372A1 (en) Ai modeling process choreography method and system based on graph algorithm
CN109710224A (en) Page processing method, device, equipment and storage medium
CN111355696A (en) Message identification method and device, DPI (deep packet inspection) equipment and storage medium
WO2023206097A1 (en) Ai model deployment method and apparatus, and electronic device and computer-readable medium
US8417489B2 (en) Duration estimation of repeated directed graph traversal
US9679092B1 (en) Constraint handling for parameterizable hardware description language
Celik et al. S-IDE: A tool framework for optimizing deployment architecture of High Level Architecture based simulation systems
US20220197881A1 (en) Multipath verification of data transforms in a system of systems
US9111032B2 (en) Identification of performance bottlenecks
KR102006212B1 (en) Method and apparatus for generating python script used in first simulator by converting xml script used in second simulator
JP2009265996A (en) Inspection device, verification method and verification program
US20190235865A1 (en) Solving constraint satisfaction problems comprising vectors of unknown size
JP2014228974A (en) Analysis method, analyzer and analysis program
US11635945B2 (en) Mobile application development device
US9626458B2 (en) Evaluation model generation device, evaluation model generation method, and evaluation model generation program
US11748075B2 (en) Two-phase application development device
KR102006211B1 (en) Method and apparatus for generating xml script used in first simulator by converting python script used in second simulator

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22938953

Country of ref document: EP

Kind code of ref document: A1