WO2022063247A1 - Neural architecture search method and apparatus - Google Patents

Neural architecture search method and apparatus

Info

Publication number
WO2022063247A1
Authority
WO
WIPO (PCT)
Prior art keywords
network
neural network
super
loss function
delay
Application number
PCT/CN2021/120434
Other languages
French (fr)
Chinese (zh)
Inventor
李明阳
周振坤
徐羽琼
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2022063247A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Abstract

A neural architecture search method and apparatus in the field of AI, which can determine a high-performing neural network architecture in a short time using a small amount of computing resources while ensuring that the theoretical latency is consistent with the true latency. The method comprises: obtaining a supernetwork according to a target task; obtaining the latency of each deep learning operator in the supernetwork when run on an electronic device; determining a latency loss function according to those latencies; performing a training operation on the supernetwork and updating its model parameters according to the latency loss function and a network loss function until the updated supernetwork satisfies the conditions for running the target task on the electronic device; and determining a target neural network architecture according to the updated architecture parameters of each network layer. The supernetwork comprises a plurality of network layers, each network layer comprises a plurality of nodes, any two nodes of a network layer are connected by a deep learning operator, and the model parameters comprise the architecture parameters of each network layer.

Description

Neural network structure search method and apparatus
This application claims priority to Chinese Patent Application No. 202011043055.2, entitled "Neural Network Structure Search Method and Apparatus", filed with the State Intellectual Property Office on September 28, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence (AI), and in particular to a neural network structure search method and apparatus.
Background
With the rapid development of AI technology, neural network models of all kinds keep emerging. The performance of a neural network structure has a major influence on how well the resulting neural network model executes its task: the better the structure performs, the better the model executes the task. Therefore, when building a neural network model, determining a high-performing neural network structure is a research hotspot for those skilled in the art.
Neural architecture search (NAS) technology emerged to address this: NAS can automatically find the best-performing neural network structure within a predefined search space. However, when NAS is used in the prior art to search for a neural network structure, it consumes a large amount of computing resources and cannot guarantee that the theoretical latency is consistent with the true latency.
Summary of the Invention
This application provides a neural network structure search method and apparatus that can determine a high-performing neural network structure in a short time using few computing resources, while ensuring that the theoretical latency is consistent with the true latency.
In a first aspect, this application provides a neural network structure search method. A neural network structure search apparatus obtains a supernetwork according to a target task, obtains the latency of each deep learning operator in the supernetwork when run on an electronic device, and determines a latency loss function of the supernetwork according to those latencies. It then performs a training operation on the supernetwork, updating the model parameters of the supernetwork according to the latency loss function and a network loss function obtained during training, until the updated supernetwork satisfies the conditions for running the target task on the electronic device; finally, it determines the target neural network structure according to the updated architecture parameters of each network layer. The supernetwork comprises a plurality of network layers, each network layer comprises a plurality of nodes, any two nodes of a network layer are connected by a deep learning operator, and the model parameters comprise the architecture parameters of each of the network layers.
In this way, the supernetwork obtained according to the target task is a structure comprising multiple network layers, each with multiple nodes, where any two nodes are connected by deep learning operators; it contains every sub-network that could be used to execute the target task. The embodiments of this application train the supernetwork and update its model parameters, which include the architecture parameters of each network layer, until the updated supernetwork satisfies the conditions; the target neural network structure, i.e. the best-performing structure, can then be determined from the updated architecture parameters of each network layer. Compared with the prior art, in which a large number of sub-networks must be trained to obtain the target neural network structure, the embodiments of this application only need to train the supernetwork, which saves a large amount of computing resources, shortens the search time, and improves search efficiency. Moreover, because the model parameters are updated with reference to a latency loss function derived from the true latency of each deep learning operator on the electronic device, the theoretical latency and the true latency remain consistent when the target neural network structure is determined.
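As an illustrative sketch of the flow of the first aspect (Python/PyTorch, the interface of the supernetwork object, and all hyperparameter values below are assumptions introduced for this description, not part of the application):

```python
import torch

def search_architecture(supernet, train_loader, op_latency, lambda_la, max_steps):
    """Train the supernetwork and update its model parameters with a combined
    latency/network loss, then derive the target structure.
    supernet is assumed to expose expected_latency() and derive_architecture();
    op_latency maps each deep learning operator to its measured on-device latency.
    """
    optimizer = torch.optim.SGD(supernet.parameters(), lr=0.01)
    criterion = torch.nn.CrossEntropyLoss()  # network loss: prediction vs. labels
    for step, (x, y) in enumerate(train_loader):
        loss_net = criterion(supernet(x), y)
        # latency loss built from the true per-operator latencies
        loss_la = lambda_la * supernet.expected_latency(op_latency)
        optimizer.zero_grad()
        (loss_net + loss_la).backward()  # one overall loss drives both objectives
        optimizer.step()                 # updates weights and architecture parameters
        if step + 1 >= max_steps:        # stand-in for "target-task conditions satisfied"
            break
    # keep the connection weights that satisfy the preset condition
    return supernet.derive_architecture()
```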
Optionally, in one possible implementation of this application, the above method of "determining the latency loss function of the supernetwork according to the latency of each deep learning operator on the electronic device" may include: the neural network structure search apparatus determines, according to a prestored correspondence between operators and network embedding coefficients, the network embedding coefficient corresponding to each deep learning operator; computes, for each deep learning operator, the product of its on-device latency and its corresponding network embedding coefficient; determines the sum of all the products; and then determines the latency loss function according to the sum and a latency consistency coefficient.
In this way, using the true on-device latency of each deep learning operator and its corresponding network embedding coefficient, the latencies of the discrete deep learning operators are assembled into a continuous latency constraint function, thereby ensuring latency consistency.
Optionally, in another possible implementation of this application, the architecture parameters of a network layer include the connection weight of each deep learning operator in that layer. In this case, the above method of "determining the target neural network structure according to the updated architecture parameters of each network layer" may include: the neural network structure search apparatus obtains, from the updated architecture parameters of each network layer, the connection weights whose values satisfy a preset condition, and determines the target neural network structure according to all the obtained connection weights.
In the prior art, the target neural network structure is determined from the single largest connection weight in each network layer's architecture parameters, whereas this application determines it from the connection weights whose values satisfy a preset condition, without limiting the number of connection weights kept per layer. When multiple connection weights are obtained from each network layer, the more connection weights are retained compared to the single one of the prior art, the more stable the resulting target neural network structure, and the better the task execution of the neural network model determined from it.
Optionally, in another possible implementation of this application, the above method of "updating the model parameters of the supernetwork according to the latency loss function and the network loss function obtained during training" may include: the neural network structure search apparatus determines an overall loss function of the supernetwork according to the latency loss function and the network loss function, and updates the model parameters of the supernetwork according to the overall loss function.
In this way, updating the model parameters of the supernetwork according to both the latency loss function and the network loss function ensures that the target neural network structure meets both the latency consistency and the network accuracy requirements.
Optionally, in another possible implementation of this application, the above method of "updating the model parameters of the supernetwork according to the overall loss function" may include: the neural network structure search apparatus determines gradient information for each model parameter according to the overall loss function, and adjusts each model parameter according to its gradient information, where the gradient information represents the adjustment coefficient of the corresponding model parameter.
This realizes updating the model parameters by means of gradients.
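As an illustrative worked form (the application does not fix a specific update rule; the gradient-descent form, the learning rate $\eta$, and the names $\mathrm{loss}_{net}$ and $\mathrm{loss}_{la}$ are assumptions), each model parameter $\theta$, including the architecture parameters of each network layer, can be adjusted by its gradient information:

$$\theta \leftarrow \theta - \eta \cdot \frac{\partial\,(\mathrm{loss}_{net} + \mathrm{loss}_{la})}{\partial \theta}$$

where $\mathrm{loss}_{net}$ is the network loss function, $\mathrm{loss}_{la}$ is the latency loss function, their sum is the overall loss function, and the partial derivative is the gradient information serving as the adjustment coefficient of $\theta$.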
In a second aspect, a neural network structure search apparatus is provided, comprising modules for executing the neural network structure search method of the first aspect or any of its possible implementations.
In a third aspect, a neural network structure search apparatus is provided, comprising a memory and a processor coupled to each other. The memory stores computer program code comprising computer instructions. When the processor executes the computer instructions, the neural network structure search apparatus performs the neural network structure search method of the first aspect or any of its possible implementations.
In a fourth aspect, a chip system is provided, applied to a neural network structure search apparatus. The chip system comprises one or more interface circuits and one or more processors, interconnected by lines. The interface circuits receive signals from the memory of the neural network structure search apparatus and send them to the processors; the signals include the computer instructions stored in the memory. When the processors execute the computer instructions, the apparatus performs the neural network structure search method of the first aspect or any of its possible implementations.
In a fifth aspect, a computer-readable storage medium is provided, comprising computer instructions that, when run on a neural network structure search apparatus, cause the apparatus to perform the neural network structure search method of the first aspect or any of its possible implementations.
In a sixth aspect, this application provides a computer program product comprising computer instructions that, when run on a neural network structure search apparatus, cause the apparatus to perform the neural network structure search method of the first aspect or any of its possible implementations.
For specific descriptions of the second to sixth aspects and their various implementations, reference may be made to the detailed description of the first aspect and its implementations; likewise, for their beneficial effects, reference may be made to the analysis of the beneficial effects of the first aspect, which is not repeated here.
These and other aspects of this application will be more clearly understood from the following description.
Brief Description of the Drawings
Fig. 1 is a schematic structural diagram of a neural network structure search system provided by an embodiment of this application;
Fig. 2 is a schematic structural diagram of a computing apparatus provided by an embodiment of this application;
Fig. 3 is a first schematic flowchart of a neural network structure search method provided by an embodiment of this application;
Fig. 4 is a schematic structural diagram of a supernetwork provided by an embodiment of this application;
Fig. 5 is a second schematic flowchart of a neural network structure search method provided by an embodiment of this application;
Fig. 6 is a third schematic flowchart of a neural network structure search method provided by an embodiment of this application;
Fig. 7 is a schematic structural diagram of a neural network structure search apparatus provided by an embodiment of this application.
Detailed Description
In the embodiments of this application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs; rather, such words are intended to present the related concepts in a concrete manner.
Hereinafter, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more such features. In the description of the embodiments of this application, unless otherwise specified, "a plurality of" means two or more.
At present, a neural network model is constructed as follows: a neural network structure is built, then trained and evaluated to obtain a high-performing structure, and the neural network model is determined according to that structure.
Most existing neural network structures are designed by hand. For example, network structures such as ResNet, which excels at image classification, and the Transformer, which dominates machine translation, were designed by experts in the field. However, such designs rely on rich expert experience and large numbers of experiments, and suffer from long design time, low accuracy, and latency inconsistency. Here, latency inconsistency means that the theoretical latency of the neural network model differs from its true latency, where the true latency is the latency actually measured when the model runs on an electronic device.
NAS technology can automatically search a predefined search space for a high-performing neural network structure, thereby solving the problems of manually designed network structures.
In a first prior-art solution, reinforcement learning is used to search for the neural network structure. Specifically, the neural network structure search apparatus uses a recurrent neural network (RNN) as a controller and, within a preset search space, samples a sub-network using the controller parameters. The sub-network is trained to convergence to obtain model evaluation metrics, such as its accuracy and its floating-point operations per second (FLOPs), and the controller parameters are then updated according to these metrics. The apparatus then repeats the above operations: it samples another sub-network with the updated controller parameters, trains it to obtain new evaluation metrics, and updates the previously updated controller parameters according to the new metrics. This cycle continues until a high-performing sub-network is obtained, which is used as the network structure of the neural network model to be determined.
However, because the apparatus must train a large number of sub-networks to find the best-performing one, and the network weights must be initialized for every sub-network training, the computing-resource cost is high. Moreover, since the controller parameters are updated with reference to FLOPs, which cannot reflect a sub-network's true latency on different electronic devices, the consistency between the sub-network's theoretical latency and its true latency cannot be guaranteed.
In a second prior-art solution, an evolutionary algorithm and reinforcement learning are used to search for the neural network structure. On top of the first solution, the search process adds the following: the neural network structure search apparatus sends each sub-network to the electronic device and receives the true latency of running that sub-network returned by the device. In this way, the apparatus can refer to the true latency rather than FLOPs when updating the controller parameters, which solves the latency inconsistency problem of the first solution.
However, the second solution still consumes a large amount of computing resources, and sending the sub-networks to the electronic device adds a further large overhead, resulting in low search efficiency for the neural network structure.
In summary, neural network structure search in the prior art consumes a large amount of computing resources and cannot guarantee that the theoretical latency is consistent with the true latency.
To determine a high-performing neural network structure in a short time using few computing resources while keeping the theoretical latency consistent with the true latency, the embodiments of this application provide a neural network structure search method: a supernetwork is obtained according to the target task, and its latency loss function is determined according to the on-device latency of each deep learning operator in the supernetwork. While training the supernetwork, its model parameters are updated according to the latency loss function and the network loss function until the updated supernetwork satisfies the conditions for running the target task on the electronic device; finally, the target neural network structure is determined according to the updated architecture parameters of each network layer. The supernetwork obtained according to the target task comprises multiple network layers, each with multiple nodes connected pairwise by deep learning operators, and contains every sub-network that could be used to execute the target task. By training only the supernetwork and updating its model parameters, which include each layer's architecture parameters, until the conditions are met, the best-performing target neural network structure can be determined from the updated architecture parameters. Compared with the prior art, in which a large number of sub-networks must be trained, this saves a large amount of computing resources, shortens the search time, and improves search efficiency. Moreover, because the latency loss function used when updating the model parameters is derived from each deep learning operator's true on-device latency, the theoretical latency and the true latency remain consistent when the target neural network structure is determined.
The neural network structure search method provided by the embodiments of this application is executed by a neural network structure search apparatus.
In one scenario, the neural network structure search apparatus may be an electronic device, which may be a server or a terminal device. That is, the electronic device itself initiates the target task and determines the best-performing target neural network structure by executing the neural network structure search method provided by the embodiments of this application, thereby determining the neural network model. The electronic device then runs the neural network model to execute the target task.
In another scenario, the neural network structure search apparatus may be a server, and the neural network model is run by a terminal device. That is, the server determines the best-performing target neural network structure by executing the method provided by the embodiments of this application, thereby determines the neural network model, and sends the model to the terminal device; the terminal device runs the received neural network model to execute the target task. Specifically, the neural network structure search method provided by the embodiments of this application is applicable to a neural network structure search system.
Fig. 1 shows one structure of the neural network structure search system. As shown in Fig. 1, the system may include a server 11 and a terminal device 12, which establish a connection by wired or wireless communication.
The server 11 is the execution body of the neural network structure search method provided by the embodiments of this application. It is mainly used to train the supernetwork and update its model parameters according to the latency loss function and the network loss function until the updated supernetwork satisfies the conditions for running the target task on the terminal device 12. It is also used to determine the target neural network structure according to the updated architecture parameters of each network layer, determine the neural network model accordingly, and send the model to the terminal device 12.
In some embodiments, the server 11 may be a single server, a server cluster composed of multiple servers, or a cloud computing service center. The embodiments of this application do not limit the specific form of the server; Fig. 1 shows a single server as an example.
The terminal device 12 is used to run the neural network model from the server 11 to execute the target task.
In some embodiments, the terminal device 12 may be: a mobile phone, a tablet computer, a notebook computer, a handheld computer, a mobile internet device (MID), a wearable device, a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, an internet of things (IoT) device, and so on. The embodiments of this application do not limit the specific form of the terminal device; Fig. 1 shows a mobile phone as an example of the terminal device 12.
The embodiments of this application do not limit the specific scenario to which the neural network structure search method is applied.
The basic hardware structures of the server 11 and the terminal device 12 are similar; both include the elements of the computing apparatus shown in Fig. 2. The hardware structures of the server 11 and the terminal device 12 are described below taking the computing apparatus of Fig. 2 as an example.
As shown in Fig. 2, the computing apparatus may include a processor 21, a memory 22, a communication interface 23, and a bus 24. The processor 21, the memory 22, and the communication interface 23 may be connected by the bus 24.
The processor 21 is the control center of the computing apparatus and may be a single processor or the collective name of multiple processing elements. For example, the processor 21 may be a general-purpose central processing unit (CPU) or another general-purpose processor, where a general-purpose processor may be a microprocessor or any conventional processor, e.g. a graphics processing unit (GPU) or a digital signal processor (DSP).
As an embodiment, the processor 21 may include one or more CPUs, such as CPU 0 and CPU 1 shown in Fig. 2.
The memory 22 may be a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
In one possible implementation, the memory 22 may exist independently of the processor 21 and be connected to it through the bus 24, for storing instructions or program code. When the processor 21 calls and executes the instructions or program code stored in the memory 22, the neural network structure search method provided by the following embodiments of this application can be implemented.
In the embodiments of this application, the software programs stored in the memory 22 differ between the server 11 and the terminal device 12, so the functions implemented by the server 11 and the terminal device 12 differ. The functions executed by each device are described with reference to the following flowcharts.
In another possible implementation, the memory 22 may be integrated with the processor 21.
The communication interface 23 is used to connect the computing apparatus with other devices through a communication network, which may be an Ethernet, a radio access network (RAN), a wireless local area network (WLAN), or the like. The communication interface 23 may include a receiving unit for receiving data and a sending unit for sending data.
The bus 24 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in Fig. 2, but this does not mean that there is only one bus or one type of bus.
It should be noted that the structure shown in Fig. 2 does not constitute a limitation on the computing apparatus; besides the components shown in Fig. 2, the computing apparatus may include more or fewer components, combine certain components, or arrange the components differently.
Based on the hardware structure of the above computing apparatus, an embodiment of this application provides a neural network structure search method, described below with reference to the accompanying drawings. The embodiments of this application take as an example the scenario in which the server executes the neural network structure search method and determines the neural network model, and the terminal device receives and runs the neural network model.
When the neural network structure search method is applied to the neural network structure search system shown in Fig. 1, as shown in Fig. 3, the method may include the following steps 301-305.
301. The server obtains a supernetwork according to the target task.
The target task is used to instruct the construction of a neural network model to run on the terminal device. The supernetwork includes multiple network layers, each including multiple nodes; any two nodes in a network layer are connected by one or more deep learning operators. The operator types may include convolution, separable convolution, dilated convolution, average pooling, and so on. Every neural network structure sampled from the supernetwork can be used to execute the target task.
Typically, each network layer includes at least two nodes. The more nodes a network layer includes, the more deep learning operators it contains, the more computing resources are required, and the higher the accuracy of the output results.
When the server obtains the target task instructing construction of a neural network model to run on the terminal device, it may first determine the best-performing target neural network structure for that model. Specifically, the server may first obtain a supernetwork according to the target task.
It can be understood that the server obtains the supernetwork according to the target task as follows: the server determines whether a historical task identical or similar to the target task exists locally. If so, the server has previously built a supernetwork for that historical task and can obtain it directly from local storage. If not, the server has not previously built a supernetwork for the target task, and it builds one according to the target task and a preset search space. Of these two ways of obtaining the supernetwork, obtaining it directly from local storage reduces the workload of searching for the target neural network structure and thus improves search efficiency.
In addition, the target task may include the output type of the neural network model. For example, the target task may be to build a face recognition neural network model to run on the terminal device, which recognizes a face and outputs the corresponding person's name. As another example, the target task may be to build a hand pose estimation model to run on the terminal device, which recognizes the hand pose of a person in a picture.
Exemplarily, Fig. 4 is a schematic structural diagram of a supernetwork provided by an embodiment of this application. As shown in Fig. 4, the supernetwork includes three network layers as an example. The first network layer includes three nodes, connected by the following deep learning operators: 3×3 standard convolution, 5×5 standard convolution, and a skip connection. The second network layer includes three nodes, connected by 3×3 standard convolution, 5×5 standard convolution, and 3×3 separable convolution. The third network layer includes four nodes, connected by 3×3 standard convolution, 5×5 separable convolution, 3×3 dilated convolution, and a skip connection. Thus, as can be seen from Fig. 4, the supernetwork includes six deep learning operators in total: 3×3 standard convolution, 5×5 standard convolution, skip connection, 3×3 separable convolution, 5×5 separable convolution, and 3×3 dilated convolution. It can be understood that the node connections shown in each network layer of Fig. 4 are merely exemplary; the embodiments of this application do not limit how the nodes of each layer are connected or which deep learning operator connects two nodes.
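As a hedged sketch of how one edge of such a supernetwork could be represented in code (PyTorch, the class and operator names, and the softmax mixing are illustrative assumptions in the style of differentiable-NAS implementations, not the application's own code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Candidate deep learning operators matching the six listed above; each
# lambda builds an operator that preserves the spatial size for c channels.
CANDIDATE_OPS = {
    "conv3x3":     lambda c: nn.Conv2d(c, c, 3, padding=1),
    "conv5x5":     lambda c: nn.Conv2d(c, c, 5, padding=2),
    "sep_conv3x3": lambda c: nn.Sequential(
        nn.Conv2d(c, c, 3, padding=1, groups=c), nn.Conv2d(c, c, 1)),
    "sep_conv5x5": lambda c: nn.Sequential(
        nn.Conv2d(c, c, 5, padding=2, groups=c), nn.Conv2d(c, c, 1)),
    "dil_conv3x3": lambda c: nn.Conv2d(c, c, 3, padding=2, dilation=2),
    "skip":        lambda c: nn.Identity(),
}

class MixedEdge(nn.Module):
    """One connection between two nodes: all candidate operators in parallel,
    mixed by learnable connection weights (the architecture parameters)."""
    def __init__(self, channels, op_names):
        super().__init__()
        self.ops = nn.ModuleList([CANDIDATE_OPS[n](channels) for n in op_names])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # connection weights

    def forward(self, x):
        w = F.softmax(self.alpha, dim=0)  # normalize the connection weights
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

# Example: one edge carrying all six candidates on a 16-channel feature map.
edge = MixedEdge(16, list(CANDIDATE_OPS))
out = edge(torch.randn(1, 16, 32, 32))  # shape preserved: (1, 16, 32, 32)
```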
302. The server obtains the latency of each deep learning operator in the supernetwork when run on the terminal device.
After constructing the supernetwork, the server may send each deep learning operator in it to the terminal device. The terminal device runs each received deep learning operator and returns the latency of running each operator to the server. In this way, the server obtains the on-device latency of every deep learning operator.
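A hedged sketch of what the measurement on the terminal device could look like (PyTorch and the warm-up/averaging scheme are assumptions; the application only requires that the device run each operator and return its latency):

```python
import time
import torch

def measure_op_latency(op, input_shape, warmup=10, runs=100):
    """Average wall-clock latency of a single deep learning operator.
    On accelerator hardware the timed region would additionally need device
    synchronization (e.g. torch.cuda.synchronize)."""
    op.eval()
    x = torch.randn(*input_shape)
    with torch.no_grad():
        for _ in range(warmup):           # warm-up to stabilize caches/clocks
            op(x)
        start = time.perf_counter()
        for _ in range(runs):
            op(x)
    return (time.perf_counter() - start) / runs  # seconds per run

# Example: latency of a 3x3 convolution on a 1x16x32x32 input.
lat = measure_op_latency(torch.nn.Conv2d(16, 16, 3, padding=1), (1, 16, 32, 32))
```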
303. The server determines the latency loss function of the supernetwork according to the on-device latency of each deep learning operator.
After obtaining the on-device latency of each deep learning operator, the server can determine the latency loss function of the whole supernetwork from these latencies. For details, refer to the description of steps 303A-303C below.
304. The server performs a training operation on the supernetwork and updates its model parameters according to the latency loss function and the network loss function obtained during training, until the updated supernetwork satisfies the conditions for running the target task on the terminal device.
The network loss function characterizes the difference between the supernetwork's predicted output and the data labels: the larger its output value, the larger the difference. The training process of the supernetwork can be understood as the process of reducing the output values of the latency loss function and the network loss function as much as possible.
After obtaining the supernetwork in step 301, the server may train it and update its model parameters according to the latency loss function determined in step 303 and the network loss function obtained during training, stopping the training process once the updated supernetwork satisfies the conditions for running the target task on the terminal device. The model parameters may include the architecture parameters of each of the multiple network layers.
It can be understood that the above conditions may include accuracy requirements and latency requirements. For example, the conditions may include that the accuracy of the output produced by running the supernetwork reaches a preset percentage, that the latency of running the supernetwork is less than a preset time value, and so on. These conditions are specified in advance for the target task and the hardware structure of the terminal device.
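As a hedged illustration of such a stopping condition (the function name and the preset percentage and time value below are made-up assumptions, not values from the application), the check could look like:

```python
def meets_conditions(accuracy, latency_ms, min_accuracy=0.95, max_latency_ms=20.0):
    """True once the supernetwork's output accuracy reaches the preset
    percentage and its running latency is below the preset time value."""
    return accuracy >= min_accuracy and latency_ms <= max_latency_ms
```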
305. The server determines the target neural network structure according to the updated architecture parameters of each network layer.
The target neural network structure is the best-performing network structure.
In a specific implementation, the architecture parameters of each network layer may include the connection weight of each deep learning operator among all the deep learning operators of that layer. In this case, the server may determine the target neural network structure as follows: it first obtains, from the updated architecture parameters of each network layer, the connection weights whose values satisfy a preset condition, and then determines the target neural network structure according to all the obtained connection weights.
It can be understood that the preset condition may be implemented in various ways.
In one possible implementation, the preset condition may be a preset number of connection weights per network layer: the connection weights satisfying the condition are the top preset number of weights after all connection weights in the layer are sorted in descending order. The preset number of connection weights may be the same or different for different network layers.
In another possible implementation, the preset condition may be connection weights greater than a preset weight value. In this way, the server can obtain, from the architecture parameters of each network layer, the connection weights greater than the preset weight value, and determine the target neural network structure according to all the obtained connection weights. Of course, the preset condition may also set a corresponding preset weight value for each network layer, and the preset weight values corresponding to different network layers may be the same or different.
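A hedged sketch covering both of the preset conditions just described (keep a preset number of weights per layer, or keep weights above a preset value); the function name and interface are illustrative assumptions:

```python
import torch

def select_connection_weights(layer_alphas, top_k=None, threshold=None):
    """layer_alphas: one 1-D tensor per network layer, holding the updated
    connection weight of each deep learning operator in that layer.
    Exactly one of top_k/threshold is assumed to be given.
    Returns, per layer, the indices of the operators to retain."""
    kept = []
    for alpha in layer_alphas:
        if top_k is not None:
            idx = torch.topk(alpha, top_k).indices         # preset-number condition
        else:
            idx = (alpha > threshold).nonzero().flatten()  # preset-value condition
        kept.append(sorted(idx.tolist()))
    return kept

# Example: keep the two largest connection weights in each of two layers.
arch = select_connection_weights([torch.rand(4), torch.rand(6)], top_k=2)
```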
In the prior art, the target neural network structure is determined from the single largest connection weight in each network layer's architecture parameters, whereas this application determines it from the connection weights whose values satisfy the preset condition, without limiting the number of connection weights kept per layer. When multiple connection weights are obtained from each network layer, the more connection weights are retained compared to the single one of the prior art, the more stable the resulting target neural network structure, and the better the task execution of the neural network model determined from it.
The neural network structure search method provided by the embodiments of this application obtains a supernetwork according to the target task and determines the supernetwork's latency loss function according to the on-device latency of each deep learning operator in it. During training of the supernetwork, its model parameters are updated according to the latency loss function and the network loss function until the updated supernetwork satisfies the conditions for running the target task on the electronic device; finally, the target neural network structure is determined according to the updated architecture parameters of each network layer. The supernetwork obtained according to the target task comprises multiple network layers, each with multiple nodes connected pairwise by deep learning operators, and contains every sub-network that could be used to execute the target task. Because only the supernetwork needs to be trained to obtain the best-performing target neural network structure, rather than the large numbers of sub-networks required in the prior art, a large amount of computing resources is saved, the search time is shortened, and search efficiency is improved. And because the model parameters are updated with reference to a latency loss function derived from each operator's true on-device latency, the theoretical latency and the true latency remain consistent when the target neural network structure is determined.
For example, suppose the target task is to build a hand pose estimation model to run on the terminal device, with the model running on the terminal device's GPU. If the target neural network structure must be determined within one day, a prior-art solution might require thousands of GPUs, whereas the method provided by the embodiments of this application might require only one or two GPUs. The method thus greatly reduces the computing resources needed to search for the target neural network structure.
Optionally, in the embodiments of this application, based on Fig. 3 and as shown in Fig. 5, step 303 may specifically include the following steps 303A-303C.
303A. The server determines the network embedding coefficient corresponding to each deep learning operator according to a prestored correspondence between operators and network embedding coefficients.
The purpose of the network embedding coefficients is to keep the latency loss function obtained from them consistent in meaning with the network loss function.
303B. The server computes, for each deep learning operator, the product of its on-device latency and its corresponding network embedding coefficient, and determines the sum of all the products.
After determining the network embedding coefficient corresponding to each deep learning operator, the server can compute the product of each operator's on-device latency and its corresponding network embedding coefficient, and add all the products to obtain the sum.
In a specific implementation, suppose the server uses $\mathrm{LAT}(S)$ to denote the set of delays of the deep learning operators in the super-network running on the terminal device:

$$\mathrm{LAT}(S) = \{\, lat_{operator_i} \mid operator_i \in S \,\}$$

where $S$ denotes the set of all deep learning operators in the super-network, $operator_i$ denotes a deep learning operator in the set $S$, and $lat_{operator_i}$ denotes the delay of the $i$-th deep learning operator in $S$ on the terminal device.
The server can then calculate, for each deep learning operator, the product of the delay of the deep learning operator running on the terminal device and the network embedding coefficient corresponding to the deep learning operator, and calculate the sum of all of the products. The sum satisfies the following formula:

$$lat_{sum} = \sum_{operator_i \in S} \alpha_i \cdot lat_{operator_i}$$

where $lat_{operator_i}$ denotes the delay of the $i$-th deep learning operator in $S$ on the terminal device, $\alpha_i$ denotes the network embedding coefficient corresponding to the $i$-th deep learning operator, and $lat_{sum}$ denotes the weighted sum over all deep learning operators in $S$, that is, the sum of the products of each operator's delay and its corresponding network embedding coefficient.
303C. The server determines the delay loss function according to the sum and a delay consistency coefficient.
After obtaining the sum $lat_{sum}$ of all of the products, the server can determine the delay loss function. The delay loss function satisfies the following formula:

$$loss_{la} = \lambda_{la} \cdot lat_{sum} = \lambda_{la} \cdot \sum_{operator_i \in S} \alpha_i \cdot lat_{operator_i}$$

where $\lambda_{la}$ denotes the delay consistency coefficient and $loss_{la}$ denotes the delay loss function.
It can be understood that the foregoing $\lambda_{la}$ is a variable: it is a matrix formed by the connection weights of each deep learning operator in each of the plurality of network layers of the super-network, and it is continuously updated during the training process.
In this way, by using the real delay of each deep learning operator running on the electronic device and the network embedding coefficient corresponding to each deep learning operator, the delays corresponding to the discrete deep learning operators are built into a continuous delay constraint function, thereby ensuring delay consistency.
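Continuing the earlier sketch, the delay loss can be assembled from the measured per-operator delays. The latency values, the use of the softmaxed connection weights as the network embedding coefficients $\alpha_i$, and the treatment of $\lambda_{la}$ as a scalar (the application describes it as a trainable quantity) are simplifying assumptions for illustration.

```python
# Measured on-device delays per candidate operator, in milliseconds
# (illustrative values; in practice these come from profiling on the device).
MEASURED_LAT_MS = torch.tensor([0.42, 0.81, 0.09, 0.01])  # conv3x3, conv5x5, pool, identity

def delay_loss(edges: list, lambda_la: float) -> torch.Tensor:
    """loss_la = lambda_la * sum_i alpha_i * lat_i over all operators in the super-network."""
    total = torch.zeros(())
    for edge in edges:  # every MixedEdge in the super-network
        coeffs = F.softmax(edge.alpha, dim=0)  # network embedding coefficients
        total = total + (coeffs * MEASURED_LAT_MS).sum()
    return lambda_la * total
```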
Optionally, in this embodiment of this application, based on FIG. 5 and as shown in FIG. 6, the foregoing step 304 may specifically include the following steps 304A-304B.
304A. The server performs a training operation on the super-network, and determines an overall loss function of the super-network according to the delay loss function and the network loss function.
The delay loss function ensures the delay consistency of the target neural network structure, and the network loss function ensures the precision requirement of the target neural network structure, that is, the accuracy requirement.
It can be understood that, to avoid overfitting of the super-network, the server also needs to take a network regularization term into account when determining the overall loss function.
Specifically, the server can determine the overall loss function of the super-network. The overall loss function satisfies the following formula:

$$loss = loss_{mse} + loss_{la} + loss_{reg}$$

where $loss_{la}$ denotes the delay loss function, $loss_{mse}$ denotes the network loss function, $loss_{reg}$ denotes the network regularization term, and $loss$ denotes the overall loss function.
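A sketch of the combination follows. The application names the network loss $loss_{mse}$, so mean squared error is used here; the L2 form of the regularization term and the function signature are illustrative assumptions.

```python
def overall_loss(pred, target, edges, lambda_la, weight_decay):
    loss_mse = F.mse_loss(pred, target)                 # network loss: accuracy requirement
    loss_la = delay_loss(edges, lambda_la)              # delay loss: delay consistency
    loss_reg = weight_decay * sum(                      # network regularization term
        p.pow(2).sum() for edge in edges for p in edge.ops.parameters())
    return loss_mse + loss_la + loss_reg
```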
304B. The server updates the model parameters of the super-network according to the overall loss function, until the updated super-network satisfies the condition for the target task to run on the terminal device.
In a specific implementation, the server may determine, according to the overall loss function, gradient information of each model parameter, where the gradient information is used to represent an adjustment coefficient of the corresponding model parameter. The server may then adjust each model parameter according to its gradient information.
In a case in which the model parameters include network parameters and architecture parameters of each network layer, and the network parameters of a network layer include the weight of each deep learning operator of that network layer, the server may first update the network parameters of each network layer. The updated network parameters satisfy the following formula:

$$W_N = W_N' - \nabla_w loss$$

where $w$ denotes the weight of one deep learning operator among the network parameters of a network layer, $W_N'$ denotes the value of the network parameter $w$ after the previous training iteration, $\nabla_w loss$ denotes the gradient information of the network parameter $w$, and $W_N$ denotes the updated value of the network parameter $w$.
The server can then update the architecture parameters of each network layer. The updated architecture parameters satisfy the following formula:
$$W_A = W_A' - \nabla_a loss$$

where $a$ denotes the connection weight of one deep learning operator among the architecture parameters of a network layer, $W_A'$ denotes the value of the architecture parameter $a$ after the previous training iteration, $\nabla_a loss$ denotes the gradient information of the architecture parameter $a$, and $W_A$ denotes the updated value of the architecture parameter $a$.
In this way, by updating the model parameters of the super-network according to the delay loss function and the network loss function, it is ensured that the target neural network structure meets both the delay consistency requirement and the network precision requirement.
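A minimal training-loop sketch of steps 304A-304B, continuing the earlier snippets, is given below. The alternating two-optimizer scheme, the optimizer choices, the hyperparameter values, and the dummy data are illustrative assumptions; the application specifies only that each model parameter is adjusted according to its gradient information.

```python
# A toy super-network of three mixed edges stacked sequentially.
edges = [MixedEdge(16) for _ in range(3)]
model = nn.Sequential(*edges)
lambda_la, weight_decay = 0.1, 3e-4  # illustrative hyperparameters

net_params = [p for edge in edges for p in edge.ops.parameters()]  # network parameters w
arch_params = [edge.alpha for edge in edges]                       # architecture parameters a
w_optimizer = torch.optim.SGD(net_params, lr=0.025)   # gradient step on w
a_optimizer = torch.optim.Adam(arch_params, lr=3e-4)  # gradient step on a

for _ in range(100):  # stand-in for iterating over a training data loader
    x = torch.randn(8, 16, 32, 32)  # dummy batch; a real task supplies data
    y = torch.randn(8, 16, 32, 32)
    # Update the network parameters w from the gradient of the overall loss.
    w_optimizer.zero_grad()
    overall_loss(model(x), y, edges, lambda_la, weight_decay).backward()
    w_optimizer.step()
    # Then update the architecture parameters a (connection weights) likewise.
    a_optimizer.zero_grad()
    overall_loss(model(x), y, edges, lambda_la, weight_decay).backward()
    a_optimizer.step()
```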
The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of the method. To implement the foregoing functions, corresponding hardware structures and/or software modules for performing each function are included. A person skilled in the art should be easily aware that, with reference to the algorithm steps of the examples described in the embodiments disclosed in this specification, this application can be implemented in a form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
FIG. 7 is a schematic structural diagram of a neural network structure search apparatus 70 according to an embodiment of this application. The neural network structure search apparatus 70 is configured to perform the neural network structure search method shown in any one of FIG. 3, FIG. 5, and FIG. 6. The neural network structure search apparatus 70 may include an obtaining unit 71, a determining unit 72, a training unit 73, and an updating unit 74.
The obtaining unit 71 is configured to obtain a super-network according to a target task, where the super-network includes a plurality of network layers, each network layer includes a plurality of nodes, and any two nodes of a network layer are connected by a deep learning operator; and is further configured to obtain a delay of each deep learning operator in the super-network running on an electronic device. For example, with reference to FIG. 3, the obtaining unit 71 may be configured to perform step 301 and step 302. The determining unit 72 is configured to determine a delay loss function of the super-network according to the delay, obtained by the obtaining unit 71, of each deep learning operator running on the electronic device. For example, with reference to FIG. 3, the determining unit 72 may be configured to perform step 303. The training unit 73 is configured to perform a training operation on the super-network obtained by the obtaining unit 71. For example, with reference to FIG. 3, the training unit 73 may be configured to perform the training operation on the super-network described in step 304. The updating unit 74 is configured to update model parameters of the super-network according to the delay loss function determined by the determining unit 72 and the network loss function obtained in the training process of the training unit 73, until the updated super-network satisfies a condition for the target task to run on the electronic device, where the model parameters include architecture parameters of each of the plurality of network layers. For example, with reference to FIG. 3, the updating unit 74 may be configured to perform, in step 304, the updating of the model parameters of the super-network according to the delay loss function and the network loss function obtained in the training process. The determining unit 72 is further configured to determine a target neural network structure according to the architecture parameters of each network layer updated by the updating unit 74. For example, with reference to FIG. 3, the determining unit 72 may be configured to perform step 305.
Optionally, the determining unit 72 is specifically configured to: determine, according to a pre-stored correspondence between operators and network embedding coefficients, a network embedding coefficient corresponding to each deep learning operator; determine, for each deep learning operator, the product of the delay of the deep learning operator running on the electronic device and the network embedding coefficient corresponding to the deep learning operator, and determine the sum of all of the products; and determine the delay loss function according to the sum and a delay consistency coefficient.
Optionally, the architecture parameters of a network layer include the connection weight of each deep learning operator of the network layer, and the determining unit 72 is specifically configured to: obtain, from the updated architecture parameters of each network layer, the connection weights whose values satisfy a preset condition; and determine the target neural network structure according to all of the obtained connection weights.
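For illustration, a sketch of this derivation step follows: on each edge, the operator whose connection weight satisfies the preset condition is kept, the condition being taken here, as one plausible choice, to be the largest softmax weight.

```python
OP_NAMES = ["conv3x3", "conv5x5", "max_pool", "identity"]  # matches the illustrative pool

def derive_architecture(edges: list) -> list:
    """Select, on each edge, the operator whose connection weight satisfies the
    preset condition (here: the maximum weight), yielding the target structure."""
    chosen = []
    for edge in edges:
        weights = F.softmax(edge.alpha, dim=0)
        chosen.append(OP_NAMES[int(weights.argmax().item())])
    return chosen
```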
Optionally, the updating unit 74 is specifically configured to: determine the overall loss function of the super-network according to the delay loss function and the network loss function; and update the model parameters of the super-network according to the overall loss function.
Optionally, the updating unit 74 is specifically configured to: determine, according to the overall loss function, gradient information of each model parameter, where the gradient information is used to represent an adjustment coefficient of the corresponding model parameter; and adjust each model parameter according to its gradient information.
Certainly, the neural network structure search apparatus 70 provided in this embodiment of this application includes, but is not limited to, the foregoing modules.
In actual implementation, the obtaining unit 71, the determining unit 72, the training unit 73, and the updating unit 74 may be implemented by the processor shown in FIG. 2. For the specific execution process, refer to the description of the neural network structure search method shown in FIG. 3, FIG. 5, or FIG. 6; details are not described here again.
Another embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium stores computer instructions. When the computer instructions are run on a neural network structure search apparatus, the neural network structure search apparatus is enabled to perform the steps performed by the neural network structure search apparatus in the method procedure shown in the foregoing method embodiments.
Another embodiment of this application further provides a chip system, applied to a neural network structure search apparatus. The chip system includes one or more interface circuits and one or more processors, interconnected through lines. The interface circuits are configured to receive signals from a memory of the neural network structure search apparatus and send the signals to the processors, where the signals include the computer instructions stored in the memory. When the processors execute the computer instructions, the neural network structure search apparatus performs the steps performed by the neural network structure search apparatus in the method procedure shown in the foregoing method embodiments.
In another embodiment of this application, a computer program product is further provided. The computer program product includes computer instructions. When the computer instructions are run on a neural network structure search apparatus, the neural network structure search apparatus is enabled to perform the steps performed by the neural network structure search apparatus in the method procedure shown in the foregoing method embodiments.
In the foregoing embodiments, all or some of the functions may be implemented by software, hardware, firmware, or any combination thereof. When a software program is used for implementation, the functions may be implemented completely or partially in a form of a computer program product. The computer program product includes one or more computer instructions. When the computer-executable instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are completely or partially generated. The computer may be a general-purpose computer, a dedicated computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible by a computer, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
The foregoing descriptions are merely specific implementations of this application. Any variation or replacement that a person skilled in the art can readily conceive of based on the specific implementations provided in this application shall fall within the protection scope of this application.

Claims (12)

  1. A neural network structure search method, comprising:
    obtaining a super-network according to a target task, wherein the super-network comprises a plurality of network layers, each network layer comprises a plurality of nodes, and any two nodes of a network layer are connected by a deep learning operator;
    obtaining a delay of each deep learning operator in the super-network running on an electronic device;
    determining a delay loss function of the super-network according to the delay of each deep learning operator running on the electronic device;
    performing a training operation on the super-network, and updating model parameters of the super-network according to the delay loss function and a network loss function obtained in the training process, until the updated super-network satisfies a condition for the target task to run on the electronic device, wherein the model parameters comprise architecture parameters of each of the plurality of network layers; and
    determining a target neural network structure according to the updated architecture parameters of each network layer.
  2. The neural network structure search method according to claim 1, wherein the determining a delay loss function of the super-network according to the delay of each deep learning operator running on the electronic device comprises:
    determining, according to a pre-stored correspondence between operators and network embedding coefficients, a network embedding coefficient corresponding to each deep learning operator;
    determining, for each deep learning operator, a product of the delay of the deep learning operator running on the electronic device and the network embedding coefficient corresponding to the deep learning operator, and determining a sum of all of the products; and
    determining the delay loss function according to the sum and a delay consistency coefficient.
  3. The neural network structure search method according to claim 1 or 2, wherein the architecture parameters of a network layer comprise a connection weight of each deep learning operator of the network layer, and the determining a target neural network structure according to the updated architecture parameters of each network layer comprises:
    obtaining, from the updated architecture parameters of each network layer, connection weights whose values satisfy a preset condition; and
    determining the target neural network structure according to all of the obtained connection weights.
  4. The neural network structure search method according to any one of claims 1 to 3, wherein the updating model parameters of the super-network according to the delay loss function and a network loss function obtained in the training process comprises:
    determining an overall loss function of the super-network according to the delay loss function and the network loss function; and
    updating the model parameters of the super-network according to the overall loss function.
  5. The neural network structure search method according to claim 4, wherein the updating the model parameters of the super-network according to the overall loss function comprises:
    determining, according to the overall loss function, gradient information of each model parameter, wherein the gradient information is used to represent an adjustment coefficient of the corresponding model parameter; and
    adjusting each model parameter according to the gradient information of the model parameter.
  6. A neural network structure search apparatus, comprising:
    an obtaining unit, configured to obtain a super-network according to a target task, wherein the super-network comprises a plurality of network layers, each network layer comprises a plurality of nodes, and any two nodes of a network layer are connected by a deep learning operator, and further configured to obtain a delay of each deep learning operator in the super-network running on an electronic device;
    a determining unit, configured to determine a delay loss function of the super-network according to the delay, obtained by the obtaining unit, of each deep learning operator running on the electronic device;
    a training unit, configured to perform a training operation on the super-network obtained by the obtaining unit; and
    an updating unit, configured to update model parameters of the super-network according to the delay loss function determined by the determining unit and a network loss function obtained in the training process of the training unit, until the updated super-network satisfies a condition for the target task to run on the electronic device, wherein the model parameters comprise architecture parameters of each of the plurality of network layers;
    wherein the determining unit is further configured to determine a target neural network structure according to the architecture parameters of each network layer updated by the updating unit.
  7. The neural network structure search apparatus according to claim 6, wherein the determining unit is specifically configured to:
    determine, according to a pre-stored correspondence between operators and network embedding coefficients, a network embedding coefficient corresponding to each deep learning operator;
    determine, for each deep learning operator, a product of the delay of the deep learning operator running on the electronic device and the network embedding coefficient corresponding to the deep learning operator, and determine a sum of all of the products; and
    determine the delay loss function according to the sum and a delay consistency coefficient.
  8. The neural network structure search apparatus according to claim 6 or 7, wherein the architecture parameters of a network layer comprise a connection weight of each deep learning operator of the network layer, and the determining unit is specifically configured to:
    obtain, from the updated architecture parameters of each network layer, connection weights whose values satisfy a preset condition; and
    determine the target neural network structure according to all of the obtained connection weights.
  9. The neural network structure search apparatus according to any one of claims 6 to 8, wherein the updating unit is specifically configured to:
    determine an overall loss function of the super-network according to the delay loss function and the network loss function; and
    update the model parameters of the super-network according to the overall loss function.
  10. The neural network structure search apparatus according to claim 9, wherein the updating unit is specifically configured to:
    determine, according to the overall loss function, gradient information of each model parameter, wherein the gradient information is used to represent an adjustment coefficient of the corresponding model parameter; and
    adjust each model parameter according to the gradient information of the model parameter.
  11. A neural network structure search apparatus, wherein the neural network structure search apparatus comprises a memory and a processor, the memory is coupled to the processor, the memory is configured to store computer program code, and the computer program code comprises computer instructions; when the processor executes the computer instructions, the neural network structure search apparatus performs the neural network structure search method according to any one of claims 1 to 5.
  12. A computer-readable storage medium, comprising computer instructions, wherein when the computer instructions are run on a neural network structure search apparatus, the neural network structure search apparatus is enabled to perform the neural network structure search method according to any one of claims 1 to 5.