WO2021218517A1 - Method for obtaining a neural network model, and image processing method and apparatus - Google Patents

Method for obtaining a neural network model, and image processing method and apparatus

Info

Publication number
WO2021218517A1
Authority
WO
WIPO (PCT)
Prior art keywords
network model
sub
super
models
target
Prior art date
Application number
PCT/CN2021/083371
Other languages
English (en)
Chinese (zh)
Inventor
田沈晶
黄泽毅
徐凯翔
唐少华
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021218517A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Definitions

  • This application relates to the field of artificial intelligence, and more specifically, to a method for obtaining a neural network model, an image processing method, and an apparatus.
  • Artificial intelligence is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a similar way to human intelligence.
  • Artificial intelligence is to study the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
  • Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision-making and reasoning, human-computer interaction, recommendation and search, and basic AI theories.
  • neural network models for example, convolutional neural network models
  • neural network models have also been used in the processing and analysis of various media signals such as images, videos, and voices.
  • Taking image recognition as an example, deep neural network models outperform traditional computer vision methods by an overwhelming margin.
  • training a good deep neural network model requires a lot of expert experience.
  • automatic search of neural network models with the help of automated machine learning (AutoML) technology has gradually become a hot spot in the field of computer vision. AutoML can get a better neural network model than manual design.
  • the present application provides a method for obtaining a neural network model, an image processing method and a device, which reduce training costs and improve the performance of the neural network model in the process of obtaining the required neural network model.
  • a method for obtaining a neural network model includes: obtaining a pre-trained super network model, where the pre-trained super network model is trained based on a source data set; obtaining a target data set, where the task corresponding to the target data set is the same as the task corresponding to the source data set; performing transfer learning on the pre-trained super network model based on the target data set to obtain the super network model after transfer learning; and searching for a sub-network model in the super network model after transfer learning to obtain a target neural network model.
  • the source data set can be a data set with a large amount of data, which can ensure that the super network model is fully trained, and a more accurate super network model can be obtained.
  • the source data set may be a data set related to the task to be performed by the target neural network model. That is to say, the tasks performed by the sub-network model in the pre-trained super network model and the target neural network model are consistent. For example, both are used for image classification; or both are used for image segmentation; or both are used for target detection.
  • the source data set may be the public data set ImageNet.
  • the target data set can be a data set input by the user, or a data set obtained from other devices.
  • Performing transfer learning on the pre-trained super network model based on the target data set may mean fine-tuning the pre-trained super network model based on the target data set.
  • Performing transfer learning on the pre-trained super network model refers to transferring the weights of the pre-trained super network model.
  • the target neural network model may refer to a neural network model whose performance index meets the target performance index. That is to say, the target sub-network model can be searched in the super network model after transfer learning, and the target neural network model can be determined according to the target sub-network model.
  • the target sub-network model may be a sub-network model whose performance index meets the target performance index.
  • the target sub-network model can be one sub-network model or multiple sub-network models.
  • the performance index of the sub-network model may include the inference accuracy of the sub-network model, the hardware cost of the sub-network model, or the inference time length of the sub-network model, etc.
  • Target performance indicators can include target accuracy, target cost, or target reasoning time.
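As a minimal sketch of how a target performance index could be used to pick target sub-network models, the following Python snippet filters candidates by accuracy, hardware cost, and inference time. The `Metrics` class, the candidate names, all numbers, and the choice to require all three indicators at once are illustrative assumptions, not details taken from the application.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """Hypothetical container for the performance index of one sub-network model."""
    accuracy: float    # inference accuracy, higher is better
    cost: float        # hardware cost (arbitrary units), lower is better
    latency_ms: float  # inference time, lower is better


def meets_target(m: Metrics, target: Metrics) -> bool:
    # Assumption: a candidate qualifies only if it satisfies every target indicator.
    return (m.accuracy >= target.accuracy
            and m.cost <= target.cost
            and m.latency_ms <= target.latency_ms)


def select_targets(candidates: dict, target: Metrics) -> list:
    # There may be one or several target sub-network models.
    return [name for name, m in candidates.items() if meets_target(m, target)]


# Made-up example values.
candidates = {
    "subnet_a": Metrics(accuracy=0.78, cost=300.0, latency_ms=12.0),
    "subnet_b": Metrics(accuracy=0.74, cost=150.0, latency_ms=6.0),
    "subnet_c": Metrics(accuracy=0.80, cost=600.0, latency_ms=25.0),
}
target = Metrics(accuracy=0.75, cost=400.0, latency_ms=15.0)
selected = select_targets(candidates, target)
```

With these made-up values only `subnet_a` satisfies all three targets; a system that uses a single indicator would simply drop the other comparisons.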
  • searching for the sub-network model in the super-network model after transfer learning may be to search for the sub-network model in the super-network model after transfer learning through a reinforcement learning algorithm to obtain the target neural network model.
  • the neural network model that meets the user's needs can be obtained by searching for the sub-network model in the super network model, and it can be adapted to the target data set to meet the user's needs, for example, the user's cost/precision requirements.
  • the weight of the super network model is shared between different data sets.
  • the source data set and the target data set are data sets related to the same task, which enables efficient transfer learning for AutoML. During transfer, only the weights of the super network model are fine-tuned, and the structure of the super network model does not need to be adjusted. This can greatly improve the transfer efficiency of AutoML, reduce the required training time by at least an order of magnitude, and even bring it down to the training time of an ordinary neural network model.
  • the transfer time of the super network model provided in the embodiment of the present application is close to that of an ordinary neural network model. That is to say, compared with obtaining the target neural network model through transfer learning of an ordinary neural network model, the method in the embodiment of the present application can meet users' fine-grained cost/precision requirements within the same training time, and obtains a target neural network model with higher accuracy at the same cost.
  • the pre-trained super network model is obtained through progressive shrinking training.
  • training the super network model through the progressive shrinking method may include: first training the largest sub-network model, and then progressively training sub-network models with a variable convolution kernel, sub-network models with a variable number of layers, and sub-network models with a variable number of channels.
  • the largest sub-network model refers to a sub-network model with the largest convolution kernel (kernel), the largest number of layers (depth), and the largest number of channels (width) in the super network model.
  • a sub-network model with a variable convolution kernel, a sub-network model with a variable number of layers, and a sub-network model with a variable number of channels can be trained by performing knowledge distillation on the largest sub-network model.
  • the progressive shrinking algorithm training reduces the mutual influence of the sub-network models of different sizes during the training process, and the obtained super-network model can support a variety of different architecture settings.
  • a variety of different architecture settings include different numbers of layers, different numbers of channels, and sub-network models of different convolution kernel sizes.
  • the sub-network model does not need to be retrained, and the accuracy of the sub-network model can still be guaranteed to meet the requirements of pre-training.
  • There is no need to train each sub-network model independently during the training of the super network model, and a sub-network model in the super network model can achieve an accuracy similar to that of an independently trained sub-network model.
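The staged order described above (largest sub-network first, then variable kernel, variable number of layers, variable number of channels) can be sketched as a sampling schedule. This is only an illustration: the option lists `KERNELS`, `DEPTHS`, `WIDTHS` and the `STAGES` table are hypothetical values, and the actual training and knowledge distillation steps are omitted.

```python
import random

# Hypothetical search space; the application does not give concrete values.
KERNELS = [3, 5, 7]   # convolution kernel sizes
DEPTHS = [2, 3, 4]    # number of layers
WIDTHS = [3, 4, 6]    # channel multipliers

# Each stage unlocks one more elastic dimension, in the order given in the text.
STAGES = [
    {"kernel": False, "depth": False, "width": False},  # largest sub-network only
    {"kernel": True, "depth": False, "width": False},   # variable kernel
    {"kernel": True, "depth": True, "width": False},    # + variable depth
    {"kernel": True, "depth": True, "width": True},     # + variable width
]


def sample_config(stage: int, rng: random.Random) -> dict:
    """Sample a sub-network configuration allowed at the given stage."""
    flags = STAGES[stage]

    def pick(options, elastic):
        # Dimensions that are not yet elastic stay at their maximum value.
        return rng.choice(options) if elastic else max(options)

    return {
        "kernel": pick(KERNELS, flags["kernel"]),
        "depth": pick(DEPTHS, flags["depth"]),
        "width": pick(WIDTHS, flags["width"]),
    }


rng = random.Random(0)
largest = sample_config(0, rng)
stage1_samples = [sample_config(1, rng) for _ in range(20)]
```

In a real pipeline each sampled configuration would be trained for some steps, typically with the largest sub-network acting as the teacher for knowledge distillation.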
  • performing transfer learning on the pre-trained super network model based on the target data set to obtain the super network model after transfer learning includes: selecting a sub-network model from the pre-trained super network model, calculating the weight gradient of the sub-network model based on the target data set, updating the weights of the sub-network model based on that weight gradient to obtain an updated sub-network model, and obtaining an updated super network model based on the updated sub-network model; and repeating the above steps until the updated super network model meets a termination condition, to obtain the super network model after transfer learning. The termination condition includes at least one of the following: the number of repetitions is greater than or equal to a first iteration number; or the inference accuracy of the updated super network model is greater than or equal to a first inference accuracy.
  • the super-network model is migrated through a single-pass algorithm, and the sub-network model can be sampled uniformly and trained to improve the training effect.
  • the memory space can be reduced and efficient training can be achieved.
  • performing transfer learning on the pre-trained super network model based on the target data set to obtain the super network model after transfer learning includes: selecting N_b sub-network models from the pre-trained super network model, calculating the weight gradients of the N_b sub-network models based on the target data set, and updating the weights of the N_b sub-network models based on those weight gradients to obtain an updated super network model, where N_b is a positive integer; and repeating the above steps until the updated super network model meets a termination condition.
  • the termination condition includes at least one of the following: the number of repetitions is greater than or equal to the first iteration number; or the inference accuracy of the updated super network model is greater than or equal to the first inference accuracy.
  • Selecting N_b sub-network models can also be understood as sampling N_b sub-network models.
  • Updating the weights of the N_b sub-network models also updates the weights of the super network model.
  • updating the weights of the super network model may include: subtracting the weight gradients of the N_b sub-network models from the current weights of the super network model.
  • updating the weights of the super network model may include: subtracting the product of the weight gradients of the N_b sub-network models and the learning rate from the current weights of the super network model.
  • the inference accuracy of the updated super network model may be the inference accuracy of at least one sub-network model in the super network model.
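The sample-and-update loop described above can be sketched in plain Python. Everything concrete here is an assumption made for illustration: the super network is reduced to a dictionary of scalar weights, a "sub-network" is a random subset of those weights, `toy_gradient` is a stand-in for back-propagation on the target data set, and an error tolerance stands in for the inference accuracy threshold.

```python
import random


def toy_gradient(weights, active, optimum):
    # Stand-in for the weight gradient on the target data set:
    # gradient of 0.5 * (w - optimum)^2 for the weights the sub-network uses.
    return {k: weights[k] - optimum[k] for k in active}


def transfer_step(weights, optimum, n_b, lr, rng):
    """Sample n_b sub-networks, accumulate their gradients, update shared weights."""
    names = sorted(weights)
    accumulated = {k: 0.0 for k in names}
    for _ in range(n_b):
        size = rng.randint(1, len(names))
        active = rng.sample(names, size)  # one sampled sub-network
        for k, g in toy_gradient(weights, active, optimum).items():
            accumulated[k] += g
    for k in names:
        # Subtract (gradient * learning rate) from the current super network weight.
        weights[k] -= lr * accumulated[k]


def transfer_learn(weights, optimum, n_b=4, lr=0.1, max_iters=200, tol=1e-3, seed=0):
    rng = random.Random(seed)
    iters, error = 0, float("inf")
    for iters in range(1, max_iters + 1):       # first termination condition
        transfer_step(weights, optimum, n_b, lr, rng)
        error = max(abs(weights[k] - optimum[k]) for k in weights)
        if error <= tol:                        # stand-in for the accuracy condition
            break
    return iters, error


weights = {"conv1": 1.0, "conv2": -0.5, "fc": 0.2}   # "pre-trained" values (made up)
optimum = {"conv1": 0.3, "conv2": 0.1, "fc": 0.0}    # best values for the target task
iters, error = transfer_learn(weights, optimum)
```

Because every sub-network reads and writes the same shared dictionary, updating the sampled sub-networks is the same operation as updating the super network, and no structural change is involved.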
  • searching for the sub-network model in the super network model after transfer learning to obtain the target neural network model includes:
  • Step 1: Determine n first sub-network models according to the super network model after transfer learning, where n is an integer greater than 1;
  • Step 2: Adjust the structure of the n first sub-network models to obtain n second sub-network models;
  • Step 3: Select n third sub-network models from the n first sub-network models and the n second sub-network models, and use the n third sub-network models as the n first sub-network models in Step 2;
  • Steps 2 and 3 are repeated until the n third sub-network models meet a search termination condition, where the search termination condition includes at least one of the following: the number of repetitions is greater than or equal to the second iteration number, or the inference accuracy of at least p third sub-network models among the n third sub-network models is greater than or equal to the target accuracy;
  • the target neural network model is determined according to the n third sub-network models.
  • n sub-network models are selected from the super network model after transfer learning, and these n sub-network models are the n first sub-network models.
  • the n first sub-network models can be regarded as a population.
  • adjusting the structure of the n first sub-network models may be done through operations such as crossover and mutation.
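The three search steps above resemble an evolutionary loop over a population of n sub-network configurations. The sketch below makes several assumptions: the search space values are hypothetical, `toy_accuracy` is a made-up scoring function standing in for evaluating a sub-network with the transferred super network weights, and only mutation is shown (crossover is omitted for brevity).

```python
import random

# Hypothetical search space; the application does not give concrete values.
SPACE = {"kernel": [3, 5, 7], "depth": [2, 3, 4], "width": [3, 4, 6]}


def random_config(rng):
    return {key: rng.choice(options) for key, options in SPACE.items()}


def mutate(config, rng):
    """Adjust the structure by resampling one randomly chosen dimension."""
    child = dict(config)
    key = rng.choice(sorted(SPACE))
    child[key] = rng.choice(SPACE[key])
    return child


def toy_accuracy(config):
    # Made-up score; a real system would measure inference accuracy.
    return 0.5 + 0.02 * config["kernel"] + 0.03 * config["depth"] + 0.02 * config["width"]


def search(n, max_iters, target_accuracy, p, seed=0):
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(n)]        # n first sub-networks
    for _ in range(max_iters):                                 # second iteration number
        children = [mutate(c, rng) for c in population]        # n second sub-networks
        pool = population + children
        population = sorted(pool, key=toy_accuracy, reverse=True)[:n]  # n third
        reached = sum(toy_accuracy(c) >= target_accuracy for c in population)
        if reached >= p:                                       # at least p reach target
            break
    return population


final_population = search(n=4, max_iters=50, target_accuracy=0.85, p=2)
```

Keeping the parents in the selection pool means the best score in the population never decreases from one iteration to the next.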
  • determining n first sub-network models according to the super network model after transfer learning includes: selecting n fourth sub-network models from the super network model after transfer learning; obtaining the hardware cost of the n fourth sub-network models on the target device; and adjusting the structure of the n fourth sub-network models based on the hardware cost to obtain the n first sub-network models.
  • selecting n fourth sub-network models in the super network model after migration may be randomly selecting n fourth sub-network models.
  • adjusting the sub-network model structure according to the hardware overhead of the n fourth sub-network models on the target device may include: adjusting the sub-network model structure according to the probability of the sub-network model structure adjustment, and the adjusted sub-network model can satisfy Target overhead.
  • the probability of adjusting the structure of the sub-network model is determined according to the hardware overhead of the sub-network model.
  • for a sub-network model with a large hardware overhead, the probability of adjusting the current sub-network model to a smaller sub-network model is greater than the probability of adjusting it to a larger sub-network model.
  • for a sub-network model with a small hardware overhead, the probability of adjusting the current sub-network model to a larger sub-network model is greater than the probability of adjusting it to a smaller sub-network model.
  • the size of the hardware overhead may be determined relative to the target overhead.
  • a sub-network model that is larger than the target cost can be regarded as a sub-network model with a large hardware cost
  • a sub-network model that is less than the target cost can be regarded as a sub-network model with a small hardware cost.
  • the size of the hardware overhead may also be determined relative to other benchmarks, which is not limited in the embodiment of the present application.
  • the target device may include GPU or NPU.
  • the heuristic search method can be used to perceive the hardware cost of the sub-network model on the target device, adjust the structure of the sub-network model based on the hardware cost, and then perform the search, so that the final sub-network model can meet the target cost.
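One way to picture the hardware-aware adjustment is to bias the direction of a structural change by comparing the measured cost with the target cost. The following is a sketch under stated assumptions: the function names, the fixed `bias` of 0.8, and the use of layer count as the adjusted dimension are all hypothetical.

```python
import random


def shrink_probability(cost, target_cost, bias=0.8):
    """Probability of moving to a smaller sub-network.

    Assumption: a fixed bias is used; the text only states that the
    probability depends on the hardware cost of the sub-network model.
    """
    if cost > target_cost:
        return bias          # over budget: shrinking is more likely
    if cost < target_cost:
        return 1.0 - bias    # under budget: growing is more likely
    return 0.5


def adjust_depth(depth, cost, target_cost, rng, min_depth=2, max_depth=4):
    """Adjust one structural dimension (here the number of layers)."""
    if rng.random() < shrink_probability(cost, target_cost):
        return max(min_depth, depth - 1)
    return min(max_depth, depth + 1)


rng = random.Random(0)
new_depth = adjust_depth(depth=3, cost=500.0, target_cost=400.0, rng=rng)
```

In practice the cost would come from measuring or estimating the sub-network on the target device, such as a GPU or NPU, rather than from a fixed number.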
  • an image processing method includes: obtaining an image to be processed; and using a target neural network model to process the image to be processed to obtain a processing result of the image to be processed.
  • the target neural network model is obtained by searching for a sub-network model in a super network model.
  • the super network model is obtained by performing transfer learning on a pre-trained super network model based on a target data set.
  • the pre-trained super network model is obtained by training based on a source data set, and the task corresponding to the target data set is the same as the task corresponding to the source data set.
  • Because the target neural network model is obtained by the method of the first aspect mentioned above, it better matches the application requirements of the neural network model.
  • Using such a neural network model for image classification can achieve a better image classification effect (for example, more accurate classification results).
  • Even when the target data set is small, a super network model with better performance can be obtained, which enables applications in small-data scenarios, greatly improves the accuracy of AutoML in small-data scenarios, and yields target neural network models that meet the needs of different users.
  • the pre-trained super network model is obtained through progressive shrinking training.
  • the super network model being obtained by performing transfer learning on the pre-trained super network model based on the target data set includes: the super network model is obtained by selecting a sub-network model from the pre-trained super network model, calculating the weight gradient of the sub-network model based on the target data set, updating the weights of the sub-network model based on that weight gradient to obtain an updated sub-network model, and obtaining an updated super network model based on the updated sub-network model; and repeating the above steps until the updated super network model meets a termination condition. The termination condition includes at least one of the following: the number of repetitions is greater than or equal to the first iteration number; or the inference accuracy of the updated super network model is greater than or equal to the first inference accuracy.
  • the super network model being obtained by performing transfer learning on the pre-trained super network model based on the target data set includes: the super network model is obtained by selecting N_b sub-network models from the pre-trained super network model, calculating the weight gradients of the N_b sub-network models based on the target data set, and updating the weights of the N_b sub-network models based on those weight gradients to obtain an updated super network model, where N_b is a positive integer; and repeating the above steps until the updated super network model meets a termination condition, where the termination condition includes at least one of the following: the number of repetitions is greater than or equal to the first iteration number; or the inference accuracy of the updated super network model is greater than or equal to the first inference accuracy.
  • the target neural network model being obtained by searching for sub-network models in the super network model includes: the target neural network model is determined according to n third sub-network models, which are obtained by: determining n first sub-network models according to the super network model, where n is an integer greater than 1; adjusting the structure of the n first sub-network models to obtain n second sub-network models; selecting n third sub-network models from the n first sub-network models and the n second sub-network models, and using the n third sub-network models as the n first sub-network models; and repeating the above steps until the n third sub-network models meet a search termination condition. The search termination condition includes at least one of the following: the number of repetitions is greater than or equal to the second iteration number, or the inference accuracy of at least p third sub-network models among the n third sub-network models is greater than or equal to the target accuracy.
  • determining n first sub-network models according to the super network model includes: selecting n fourth sub-network models in the super network model; obtaining the hardware cost of the n fourth sub-network models on the target device; and adjusting the structure of the n fourth sub-network models based on the hardware cost to obtain the n first sub-network models.
  • a device for obtaining a neural network model includes a module or unit for executing the method in the first aspect or any implementation of the first aspect.
  • In a fourth aspect, an image processing device is provided, which includes a module or unit for executing the method in the second aspect or any implementation of the second aspect.
  • a device for acquiring a neural network model includes: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program stored in the memory is executed, the processor is configured to execute the method in the first aspect or any implementation of the first aspect.
  • the processor in the fifth aspect mentioned above can be either a central processing unit (CPU) or a combination of a CPU and a neural network model computing processor, where the neural network model computing processor can include a graphics processing unit (GPU), a neural-network processing unit (NPU), a tensor processing unit (TPU), and so on.
  • GPU graphics processing unit
  • NPU neural-network processing unit
  • TPU tensor processing unit
  • TPU is an artificial intelligence accelerator application specific integrated circuit fully customized by Google for machine learning.
  • In a sixth aspect, an image processing device is provided, which includes: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program stored in the memory is executed, the processor is configured to execute the method in the second aspect or any implementation of the second aspect.
  • the processor in the sixth aspect mentioned above can be either a central processing unit or a combination of a CPU and a neural network model computing processor.
  • the neural network model computing processor here can include a graphics processing unit, a neural-network processing unit, a tensor processing unit, and so on.
  • TPU is Google's fully customized artificial intelligence accelerator application specific integrated circuit for machine learning.
  • a computer-readable medium stores program code for device execution, and the program code includes instructions for executing the method in any implementation of the first aspect or the second aspect.
  • a computer program product containing instructions is provided, when the computer program product runs on a computer, the computer executes the method in any one of the foregoing first aspect or second aspect.
  • In a ninth aspect, a chip is provided, which includes a processor and a data interface; the processor reads instructions stored in a memory through the data interface and executes the method in any implementation of the first aspect or the second aspect above.
  • the chip may further include a memory in which instructions are stored, and the processor is configured to execute instructions stored on the memory.
  • the processor is configured to execute the method in any one of the implementation manners of the first aspect or the second aspect.
  • the aforementioned chip may specifically be a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
  • FPGA field-programmable gate array
  • ASIC application-specific integrated circuit
  • FIG. 1 is a schematic diagram of an artificial intelligence main body framework provided by an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of a system architecture provided by an embodiment of the application.
  • FIG. 3 is a schematic structural diagram of a convolutional neural network model provided by an embodiment of the application.
  • FIG. 4 is a schematic structural diagram of another convolutional neural network model provided by an embodiment of the application.
  • FIG. 5 is a schematic diagram of the hardware structure of a chip provided by an embodiment of the application.
  • FIG. 6 is a schematic diagram of a system architecture provided by an embodiment of the application.
  • FIG. 7 is a schematic structural diagram of AutoML provided by an embodiment of the application.
  • FIG. 8 is a schematic structural diagram of a system for obtaining a neural network model provided by an embodiment of the application.
  • FIG. 9 is a schematic flowchart of a method for obtaining a neural network model provided by an embodiment of this application.
  • FIG. 10 is a schematic block diagram of a super network model provided by an embodiment of this application.
  • FIG. 11 is a schematic flowchart of a progressive shrinkage method provided by an embodiment of the application.
  • FIG. 12 is a schematic flowchart of a method for obtaining a neural network model provided by an embodiment of this application.
  • FIG. 13 is a schematic flowchart of an image processing method provided by an embodiment of the application.
  • FIG. 14 is a schematic block diagram of an apparatus for obtaining a neural network model provided by an embodiment of the present application.
  • FIG. 15 is a schematic block diagram of an image processing device provided by an embodiment of the present application.
  • FIG. 16 is a schematic block diagram of an apparatus for obtaining a neural network model provided by an embodiment of the present application.
  • FIG. 17 is a schematic block diagram of an image processing device provided by an embodiment of the present application.
  • Figure 1 shows a schematic diagram of an artificial intelligence main framework, which describes the overall workflow of the artificial intelligence system and is suitable for general artificial intelligence field requirements.
  • Intelligent Information Chain reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, intelligent execution and output. In this process, the data has gone through the condensing process of "data-information-knowledge-wisdom".
  • the infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and realizes support through the basic platform.
  • the infrastructure can communicate with the outside through sensors, and the computing power of the infrastructure can be provided by smart chips.
  • the smart chip here can be a hardware acceleration chip such as a central processing unit (CPU), a neural-network processing unit (NPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA).
  • CPU central processing unit
  • NPU neural-network processing unit
  • GPU graphics processing unit
  • ASIC application-specific integrated circuit
  • FPGA field-programmable gate array
  • the basic platform of infrastructure can include distributed computing framework and network related platform guarantee and support, and can include cloud storage and computing, interconnection network, etc.
  • data can be obtained through sensors and external communication, and then these data can be provided to the smart chip in the distributed computing system provided by the basic platform for calculation.
  • the data in the upper layer of the infrastructure is used to represent the data source in the field of artificial intelligence.
  • the data involves graphics, images, voice, text, and IoT data of traditional devices, including business data of existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • the above-mentioned data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making and other processing methods.
  • machine learning and deep learning can symbolize and formalize data for intelligent information modeling, extraction, preprocessing, training, etc.
  • Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formal information to conduct machine thinking and solving problems based on reasoning control strategies.
  • the typical function is search and matching.
  • Decision-making refers to the process of making decisions after intelligent information is reasoned, and usually provides functions such as classification, ranking, and prediction.
  • some general capabilities can be formed based on the results of the data processing, such as an algorithm or a general system, for example, translation, text analysis, computer vision processing, speech recognition, image recognition, and so on.
  • Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields. They encapsulate the overall artificial intelligence solution, productize intelligent information decision-making, and put it into practical use. Application fields mainly include: intelligent manufacturing, intelligent transportation, smart home, smart medical, smart security, autonomous driving, safe city, smart terminal, etc.
  • the embodiments of this application can be applied to many fields in artificial intelligence, for example, smart manufacturing, smart transportation, smart home, smart medical, smart security, automatic driving, safe cities and other fields.
  • the method for obtaining a neural network model in the embodiments of the present application can be specifically applied to automatic driving, image classification, image retrieval, image semantic segmentation, image quality enhancement, image super-resolution, natural language processing, and other fields that require the use of (deep) neural network models.
  • recognizing the images in the album can facilitate the user or the system to classify and manage the album and improve the user experience.
  • a neural network model suitable for album classification can be obtained or optimized.
  • the neural network model can be used to classify the pictures, so that different categories of pictures can be labeled, which is convenient for users to view and find.
  • the classification tags of these pictures can also be provided to the album management system for classification management, saving users management time, improving the efficiency of album management, and enhancing user experience.
  • Deep neural network models play an important role in multiple attribute recognition by virtue of their powerful capabilities.
  • a neural network model suitable for attribute recognition in a safe city scene can be obtained or optimized.
  • the neural network model can be used to process the input road images to identify different attribute information in the road images.
  • a neural network model can be composed of neural units.
  • A neural unit can refer to an arithmetic unit that takes $x_s$ and an intercept of 1 as inputs.
  • The output of the arithmetic unit can be:
  • $$h_{W,b}(x) = f\left(\sum_{s=1}^{n} W_s x_s + b\right)$$
  • where $s = 1, 2, \ldots, n$, and $n$ is a natural number greater than 1;
  • $W_s$ is the weight of $x_s$;
  • $b$ is the bias of the neural unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network model to convert the input signal in the neural unit into an output signal.
  • the output signal of the activation function can be used as the input of the next convolutional layer, and the activation function can be a sigmoid function.
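  • As a minimal sketch of the neural unit described above, the following Python code computes the weighted sum of the inputs plus the bias and passes it through a sigmoid activation. The function names `sigmoid` and `neural_unit` are illustrative assumptions and do not come from the application.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, one possible choice for the activation function f."""
    return 1.0 / (1.0 + math.exp(-z))


def neural_unit(x, w, b, f=sigmoid):
    """Return f(sum_s w[s] * x[s] + b) for a single neural unit.

    x: list of inputs x_s, w: list of weights W_s, b: bias.
    The names here are hypothetical and chosen only for illustration.
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    z = sum(ws * xs for ws, xs in zip(w, x)) + b
    return f(z)


# With zero weights and zero bias the pre-activation is 0, so the sigmoid returns 0.5.
print(neural_unit([1.0, 2.0], [0.0, 0.0], 0.0))
```

  • Passing a different callable as `f` swaps the activation function without changing the weighted-sum logic.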
  • the neural network model is a network formed by connecting multiple single neural units mentioned above, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the characteristics of the local receptive field.
  • the local receptive field can be a region composed of several neural units.
  • the deep neural network model (deep neural network, DNN), also known as the multi-layer neural network model, can be understood as a neural network model with multiple hidden layers.
  • the DNN is divided according to the positions of different layers.
  • The layers inside the DNN can be divided into three categories: input layer, hidden layer, and output layer. Generally speaking, the first layer is the input layer, the last layer is the output layer, and all the layers in between are hidden layers.
  • The layers are fully connected, that is, any neuron in the i-th layer is connected to every neuron in the (i+1)-th layer.
  • Although a DNN looks complicated, the work of each layer is not. Simply put, each layer computes the following linear relationship expression: $\vec{y} = \alpha(W\vec{x} + \vec{b})$, where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the offset vector, $W$ is the weight matrix (also called the coefficients), and $\alpha()$ is the activation function.
  • Each layer simply applies this operation to the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Because a DNN has many layers, the number of coefficients $W$ and offset vectors $\vec{b}$ is also relatively large.
  • These parameters are defined in the DNN as follows, taking the coefficient $W$ as an example. Suppose that in a three-layer DNN, the linear coefficient from the fourth neuron in the second layer to the second neuron in the third layer is defined as $W^{3}_{24}$. The superscript 3 represents the layer in which the coefficient $W$ is located, and the subscript corresponds to the output third-layer index 2 and the input second-layer index 4.
  • In summary, the coefficient from the k-th neuron in the (L-1)-th layer to the j-th neuron in the L-th layer is defined as $W^{L}_{jk}$.
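  • The per-layer operation described above (weight matrix, offset vector, activation function) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the names `dense_layer` and `forward` are hypothetical, `tanh` is an arbitrary choice of activation, and the code uses 0-based indices where the text uses 1-based ones.

```python
import math


def dense_layer(W, x, b, activation=math.tanh):
    """Compute activation(W x + b) for one fully connected layer.

    W[j][k] is the coefficient from input neuron k to output neuron j,
    following the "output index first, input index second" convention.
    """
    y = []
    for j, row in enumerate(W):
        z = sum(w_jk * x_k for w_jk, x_k in zip(row, x)) + b[j]
        y.append(activation(z))
    return y


def forward(layers, x):
    """Feed x through a list of (W, b) pairs, one pair per layer."""
    for W, b in layers:
        x = dense_layer(W, x, b)
    return x


# Two inputs, two outputs, identity activation to make the arithmetic visible.
print(dense_layer([[1, 2], [3, 4]], [1, 1], [0, 1], activation=lambda z: z))
```

  • Storing one `(W, b)` pair per layer mirrors the observation that the number of parameters grows with the number of layers.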
  • The convolutional neural network model (convolutional neural network, CNN) is a deep neural network model with a convolutional structure.
  • the convolutional neural network model contains a feature extractor composed of a convolutional layer and a sub-sampling layer.
  • the feature extractor can be regarded as a filter.
  • the convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network model.
  • a neuron can be connected to only part of the neighboring layer neurons.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units. Neural units in the same feature plane share weights, and the shared weights here are the convolution kernels.
  • Weight sharing can be understood as meaning that the way image information is extracted is independent of location.
  • The convolution kernel can be initialized as a matrix of random size, and it can obtain reasonable weights through learning during the training of the convolutional neural network model.
  • The direct benefit of weight sharing is to reduce the connections between the layers of the convolutional neural network model while also reducing the risk of overfitting.
  • Recurrent neural network (RNN)
  • RNN can process sequence data of any length.
  • the training of RNN is the same as the training of traditional CNN or DNN.
  • A loss function is an important equation for measuring the difference between the predicted value and the target value: the higher the output value (loss) of the loss function, the greater the difference, so training the deep neural network model becomes a process of reducing this loss as much as possible.
  • The neural network model can use the error back propagation (BP) algorithm to correct the values of its parameters during training, so that the reconstruction error loss of the neural network model becomes smaller and smaller. Specifically, passing the input signal forward to the output produces an error loss, and the parameters in the neural network model are updated by back-propagating the error loss information, so that the error loss converges.
  • The back-propagation algorithm is a backward pass driven by the error loss, and aims to obtain the optimal parameters of the neural network model, such as the weight matrix.
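  • To make the idea of reducing the loss by propagating the error backward more concrete, the following Python sketch trains a single linear unit by gradient descent. It is a deliberately tiny stand-in for back propagation in a deep network: the model `w * x + b`, the squared-error loss, the learning rate, and the function names are all assumptions chosen for illustration.

```python
def train_linear_unit(samples, lr=0.1, epochs=500):
    """Fit pred = w * x + b by gradient descent on half the mean squared error."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, target in samples:
            pred = w * x + b          # forward pass
            err = pred - target       # derivative of 0.5 * err**2 w.r.t. pred
            grad_w += err * x / n     # chain rule back to w
            grad_b += err / n         # chain rule back to b
        w -= lr * grad_w              # parameter update against the gradient
        b -= lr * grad_b
    return w, b


def mse(samples, w, b):
    """Mean squared error of the unit on the given samples."""
    return sum((w * x + b - t) ** 2 for x, t in samples) / len(samples)


data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]   # generated from target = 2 * x + 1
w, b = train_linear_unit(data)
print(mse(data, 0.0, 0.0), mse(data, w, b))   # loss before and after training
```

  • In a multi-layer model the same chain-rule step is repeated layer by layer from the output back to the input, which is what gives back propagation its name.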
  • an embodiment of the present application provides a system architecture 100.
  • a data collection device 160 is used to collect training data.
  • the training data may include training images and classification results corresponding to the training images, where the results of the training images may be manually pre-labeled results.
  • the data collection device 160 stores the training data in the database 130, and the training device 120 trains to obtain the target model/rule 101 based on the training data maintained in the database 130.
  • The training device 120 processes the input original image and compares the output image with the original image until the difference between the output image of the training device 120 and the original image is less than a certain threshold, thereby completing the training of the target model/rule 101.
  • The training device 120 may be used to obtain a pre-trained super network model, migrate the pre-trained super network model based on the target data set, and search for a sub-network model in the migrated super network model to obtain the target model/rule 101.
  • the target data set may be stored in the database 130.
  • the training device 120 may also be used to pre-train the super network model.
  • the super network model is trained based on the source data set.
  • the source data set may also be stored in the database 130.
  • the above-mentioned target model/rule 101 can be used to implement the image processing method of the embodiment of the present application.
  • the target model/rule 101 in the embodiment of the present application may specifically be a neural network model.
  • The training data maintained in the database 130 may not all come from the data collection device 160; it may also be received from other devices, such as the target data set input by the client device 140.
  • The training device 120 does not necessarily train the target model/rule 101 entirely on the training data maintained in the database 130. It may also obtain training data from the cloud or elsewhere for model training.
  • The above description should not be construed as a limitation on the embodiments of this application.
  • The target model/rule 101 obtained through training by the training device 120 can be applied to different systems or devices, such as the execution device 110 shown in FIG. 2. The execution device 110 can be a terminal, such as a mobile phone terminal, a tablet computer, a notebook computer, an augmented reality (AR)/virtual reality (VR) device, or a vehicle-mounted terminal, or it can be a server or the cloud.
  • the execution device 110 is configured with an input/output (input/output, I/O) interface 112 for data interaction with external devices.
  • the user can input data to the I/O interface 112 through the client device 140.
  • the input data in this embodiment of the present application may include: a to-be-processed image input by the client device.
  • The preprocessing module 113 is used to preprocess the input data (such as the image to be processed) received by the I/O interface 112. In the embodiment of the present application, the preprocessing module 113 may be omitted, in which case the calculation module 111 processes the input data directly.
  • The execution device 110 may call data, code, etc. in the data storage system 150 for corresponding processing.
  • the data, instructions, etc. obtained by corresponding processing may also be stored in the data storage system 150.
  • the I/O interface 112 returns the processing result, such as the classification result of the image obtained above, to the client device 140, so as to provide it to the user.
  • The training device 120 can generate corresponding target models/rules 101 based on different training data for different goals or tasks, and the corresponding target models/rules 101 can be used to achieve the above goals or complete the above tasks, thereby providing users with the desired results.
  • the user can manually set input data, and the manual setting can be operated through the interface provided by the I/O interface 112.
  • The client device 140 can automatically send input data to the I/O interface 112. If automatically sending the input data requires the user's authorization, the user can set the corresponding permission in the client device 140.
  • the user can view the result output by the execution device 110 on the client device 140, and the specific presentation form may be a specific manner such as display, sound, and action.
  • The client device 140 can also be used as a data collection terminal that collects the input data fed to the I/O interface 112 and the output result returned by the I/O interface 112 as new sample data and stores them in the database 130, as shown in the figure.
  • The I/O interface 112 may also directly store the input data fed to the I/O interface 112 and the output result returned by the I/O interface 112 in the database 130 as new sample data, as shown in the figure.
  • FIG. 2 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship between the devices, components, modules, etc. shown in the figure does not constitute any limitation.
  • The data storage system 150 is an external memory relative to the execution device 110. In other cases, the data storage system 150 may also be placed in the execution device 110.
  • the target model/rule 101 is obtained by training according to the training device 120.
  • the target model/rule 101 may be the neural network model in the embodiment of the present application.
  • The neural network model constructed in the embodiment of the present application may include a CNN, a deep convolutional neural network model (deep convolutional neural network, DCNN), a recurrent neural network model (recurrent neural network, RNN), etc.
  • CNN is a very common neural network model
  • the structure of CNN will be introduced in detail below in conjunction with Figure 3.
  • The convolutional neural network model is a deep neural network model with a convolutional structure. It is a deep learning architecture.
  • A deep learning architecture refers to performing multiple levels of learning at different levels of abstraction through the update algorithm of the neural network model.
  • CNN is a feed-forward artificial neural network model. Each neuron in the feed-forward artificial neural network model can respond to the input image.
  • a convolutional neural network model (CNN) 200 may include an input layer 210, a convolutional layer/pooling layer 220 (where the pooling layer is optional), and a neural network model layer 230.
  • the input layer 210 can obtain the image to be processed, and pass the obtained image to be processed to the convolutional layer/pooling layer 220 and the subsequent neural network model layer 230 for processing, and the processing result of the image can be obtained.
  • The convolutional layer/pooling layer 220 may include layers 221-226. For example, in one implementation, layer 221 is a convolutional layer, layer 222 is a pooling layer, layer 223 is a convolutional layer, layer 224 is a pooling layer, layer 225 is a convolutional layer, and layer 226 is a pooling layer; in another implementation, layers 221 and 222 are convolutional layers, layer 223 is a pooling layer, layers 224 and 225 are convolutional layers, and layer 226 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • the convolution layer 221 can include many convolution operators.
  • the convolution operator is also called a kernel. Its function in image processing is equivalent to a filter that extracts specific information from the input image matrix.
  • The convolution operator is essentially a weight matrix, which is usually predefined. During convolution on an image, the weight matrix is usually moved along the horizontal direction of the input image one pixel at a time (or two pixels at a time, depending on the value of the stride) to extract specific features from the image.
  • The size of the weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix and the depth dimension of the input image are the same.
  • During the convolution operation, the weight matrix extends over the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolution output with a single depth dimension. In most cases, however, a single weight matrix is not used; instead, multiple weight matrices of the same size (rows × columns), that is, multiple matrices of the same shape, are applied.
  • the output of each weight matrix is stacked to form the depth dimension of the convolutional image, where the dimension can be understood as determined by the "multiple" mentioned above.
  • Different weight matrices can be used to extract different features in the image. For example, one weight matrix is used to extract edge information of the image, another weight matrix is used to extract specific colors of the image, and another weight matrix is used to eliminate unwanted noise in the image.
  • Because the multiple weight matrices have the same size (rows × columns), the convolution feature maps they extract also have the same size, and the extracted feature maps of the same size are then merged to form the output of the convolution operation.
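  • The sliding of a weight matrix over the full depth of the input, and the stacking of one feature map per weight matrix, can be sketched in plain Python as follows. This is an unoptimized illustration under stated assumptions: no padding is applied, the image is stored as nested lists indexed `[row][column][channel]`, and the names `conv2d_single` and `conv2d` are hypothetical.

```python
def conv2d_single(image, kernel, stride=1):
    """Slide one kh x kw x C weight matrix over an H x W x C image (no padding)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    depth = len(kernel[0][0])          # the kernel spans the whole input depth
    out = []
    for i in range(0, h - kh + 1, stride):
        row = []
        for j in range(0, w - kw + 1, stride):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    for c in range(depth):
                        acc += image[i + di][j + dj][c] * kernel[di][dj][c]
            row.append(acc)
        out.append(row)
    return out


def conv2d(image, kernels, stride=1):
    """Apply several same-sized kernels; the result has one feature map per kernel."""
    return [conv2d_single(image, k, stride) for k in kernels]


# 3 x 3 single-channel image and two 2 x 2 kernels (all ones, all zeros).
image = [[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]]
ones = [[[1], [1]], [[1], [1]]]
zeros = [[[0], [0]], [[0], [0]]]
print(conv2d(image, [ones, zeros]))
```

  • The number of kernels passed to `conv2d` determines the depth of the output, which corresponds to the "multiple" weight matrices mentioned in the text.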
  • weight values in these weight matrices need to be obtained through a lot of training in practical applications.
  • Each weight matrix formed by the weight values obtained through training can be used to extract information from the input image, so that the convolutional neural network model 200 makes correct predictions.
  • The earlier (shallower) convolutional layers tend to extract more general features, which can also be called low-level features;
  • as the depth of the convolutional neural network model 200 increases, the features extracted by the later convolutional layers (for example, 226) become more and more complex, such as high-level semantic features.
  • Features with higher-level semantics are more suitable for the problem to be solved.
  • The layers 221-226 illustrated by 220 in FIG. 3 can be one convolutional layer followed by one pooling layer, or multiple convolutional layers followed by one or more pooling layers.
  • the sole purpose of the pooling layer is to reduce the size of the image space.
  • the pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain an image with a smaller size.
  • the average pooling operator can calculate the pixel values in the image within a specific range to generate an average value as the result of the average pooling.
  • the maximum pooling operator can take the pixel with the largest value within a specific range as the result of the maximum pooling.
  • the operators in the pooling layer should also be related to the image size.
  • the size of the image output after processing by the pooling layer can be smaller than the size of the image of the input pooling layer, and each pixel in the image output by the pooling layer represents the average value or the maximum value of the corresponding sub-region of the image input to the pooling layer.
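  • The average and maximum pooling operators described above can be sketched as follows. The sketch assumes non-overlapping square windows on a single-channel image stored as nested lists, and the function name `pool2d` is hypothetical.

```python
def pool2d(image, size, mode="max"):
    """Pool a 2-D image over non-overlapping size x size windows.

    mode="max" keeps the largest value in each window, mode="avg" keeps the mean.
    Rows or columns that do not fill a complete window are dropped.
    """
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    out = []
    for i in range(0, len(image) - size + 1, size):
        row = []
        for j in range(0, len(image[0]) - size + 1, size):
            window = [image[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        out.append(row)
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
print(pool2d(image, 2, "max"))   # each output pixel is the maximum of a 2 x 2 region
print(pool2d(image, 2, "avg"))   # each output pixel is the mean of a 2 x 2 region
```

  • In both modes a 4 × 4 input becomes a 2 × 2 output, so each output pixel summarizes the corresponding sub-region of the input.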
  • Neural network model layer 230
  • After processing by the convolutional layer/pooling layer 220, the convolutional neural network model 200 is still not able to output the required output information, because, as mentioned above, the convolutional layer/pooling layer 220 only extracts features and reduces the number of parameters brought by the input image. To generate the final output information (the required class information or other related information), the convolutional neural network model 200 needs to use the neural network model layer 230 to generate the output of one or a group of required classes. Therefore, the neural network model layer 230 may include multiple hidden layers (231, 232 to 23n as shown in FIG. 3) and an output layer 240. The parameters contained in the multiple hidden layers can be obtained through pre-training on the relevant training data of a specific task type. For example, the task type may include image recognition, image classification, image super-resolution reconstruction, and so on.
  • The output layer 240 has a loss function similar to categorical cross-entropy, which is specifically used to calculate the prediction error.
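  • As an illustration of a categorical cross-entropy loss of the kind mentioned for the output layer, the following Python sketch converts raw class scores into probabilities with softmax and then computes the loss for the correct class. The use of softmax and the function names are assumptions for illustration; the application does not specify the exact form of the loss.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (max is subtracted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the correct class."""
    probs = softmax(logits)
    return -math.log(probs[target_index])


# A confident correct prediction gives a small loss, a confident wrong one a large loss.
print(cross_entropy([10.0, 0.0], 0))
print(cross_entropy([10.0, 0.0], 1))
```

  • The loss is zero only when the model assigns probability 1 to the correct class, so reducing it pushes the prediction toward the label.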
  • As shown in FIG. 4, a convolutional neural network model (CNN) 200 may include an input layer 210, a convolutional layer/pooling layer 220 (where the pooling layer is optional), and a neural network model layer 230.
  • Compared with FIG. 3, the multiple convolutional layers/pooling layers in the convolutional layer/pooling layer 220 in FIG. 4 are parallel, and the respectively extracted features are input to the neural network model layer 230 for processing.
  • the convolutional neural network model shown in FIG. 3 and FIG. 4 is only used as an example of two possible convolutional neural network models of the image processing method in the embodiment of the present application.
  • the neural network model used in the image processing method of the embodiment of the present application may also exist in the form of other network models.
  • the neural network model obtained by using the neural network model acquisition method of the embodiment of the present application can be used in the image processing method in the embodiment of the present application.
  • FIG. 5 is a hardware structure of a chip provided by an embodiment of the application, and the chip includes a neural network model processor 50.
  • the chip may be set in the execution device 110 as shown in FIG. 2 to complete the calculation work of the calculation module 111.
  • the chip can also be set in the training device 120 as shown in FIG. 2 to complete the training work of the training device 120 and output the target model/rule 101.
  • the algorithms of each layer in the convolutional neural network model as shown in FIG. 3 and FIG. 4 can be implemented in the chip as shown in FIG. 5.
  • the neural network model processor NPU 50 is mounted as a coprocessor to a main central processing unit (central processing unit, CPU) (host CPU), and the main CPU distributes tasks.
  • the core part of the NPU is the arithmetic circuit 503.
  • the controller 504 controls the arithmetic circuit 503 to extract data from the memory (weight memory or input memory) and perform calculations.
  • the arithmetic circuit 503 includes multiple processing units (process engines, PE). In some implementations, the arithmetic circuit 503 is a two-dimensional systolic array. The arithmetic circuit 503 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 503 is a general-purpose matrix processor.
  • the arithmetic circuit fetches the corresponding data of matrix B from the weight memory 502 and caches it on each PE in the arithmetic circuit.
  • The arithmetic circuit fetches the data of matrix A from the input memory 501, performs the matrix operation with matrix B, and stores the partial or final results of the resulting matrix in an accumulator 508.
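  • The way partial results build up in an accumulator during a matrix operation can be illustrated in software. The following Python sketch multiplies an input matrix by a weight matrix in slices along the inner dimension and adds each partial product into an accumulator. It is only a software analogy: the tiling scheme, the `tile` parameter, and the function name `matmul_accumulate` are assumptions and do not describe the actual circuit.

```python
def matmul_accumulate(A, B, tile=2):
    """Compute A @ B by accumulating partial products over slices of the inner dimension."""
    rows, inner, cols = len(A), len(B), len(B[0])
    acc = [[0.0] * cols for _ in range(rows)]   # plays the role of the accumulator
    for k0 in range(0, inner, tile):
        k1 = min(k0 + tile, inner)
        for i in range(rows):
            for j in range(cols):
                # Partial result for this slice is added to what is already stored.
                acc[i][j] += sum(A[i][k] * B[k][j] for k in range(k0, k1))
    return acc


A = [[1, 2, 3], [4, 5, 6]]        # input matrix
B = [[1, 0], [0, 1], [1, 1]]      # weight matrix
print(matmul_accumulate(A, B))
```

  • After the last slice has been processed, the accumulator holds the final result; before that it holds partial results.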
  • the vector calculation unit 507 can perform further processing on the output of the arithmetic circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison, and so on.
  • The vector calculation unit 507 can be used for network calculations in the non-convolutional/non-FC layers of the neural network model, such as pooling, batch normalization, and local response normalization.
  • The vector calculation unit 507 can store the processed output vector in the unified memory 506.
  • the vector calculation unit 507 may apply a nonlinear function to the output of the arithmetic circuit 503, such as a vector of accumulated values, to generate the activation value.
  • the vector calculation unit 507 generates a normalized value, a combined value, or both.
  • the processed output vector can be used as an activation input to the arithmetic circuit 503, for example for use in a subsequent layer in a neural network model.
  • the unified memory 506 is used to store input data and output data.
  • The storage unit access controller 505 (direct memory access controller, DMAC) transfers the input data in the external memory to the input memory 501 and/or the unified memory 506, stores the weight data in the external memory into the weight memory 502, and stores the data in the unified memory 506 into the external memory.
  • The bus interface unit (BIU) 510 is used to implement interaction between the main CPU, the DMAC, and the instruction fetch buffer 509 through the bus.
  • An instruction fetch buffer 509 connected to the controller 504 is used to store instructions used by the controller 504.
  • The controller 504 is used to call the instructions cached in the instruction fetch buffer 509 to control the working process of the computing accelerator.
  • The unified memory 506, the input memory 501, the weight memory 502, and the instruction fetch buffer 509 are all on-chip memories.
  • the external memory is a memory external to the NPU.
  • The external memory can be a double data rate synchronous dynamic random access memory (DDR SDRAM) or a high bandwidth memory (HBM).
  • The calculation of each layer in the neural network model, for example, in the convolutional neural network models shown in FIG. 3 and FIG. 4, may be executed by the arithmetic circuit 503 or the vector calculation unit 507.
  • The pre-training operation of the super network model in the embodiment of the present application may be executed by the arithmetic circuit 503 or the vector calculation unit 507.
  • The operation of migrating the pre-trained super network model based on the target data set in the embodiment of the present application may be performed by the arithmetic circuit 503 or the vector calculation unit 507.
  • the calculation of each layer in the target neural network model in the embodiment of the present application may be performed by the arithmetic circuit 503 or the vector calculation unit 507.
  • the execution device 110 in FIG. 2 introduced above can execute each step of the image processing method of the embodiment of the present application, and the chip shown in FIG. 5 may also be used to execute each step of the image processing method of the embodiment of the present application.
  • The training device 120 in FIG. 2 introduced above can execute each step of the method for obtaining a neural network model in an embodiment of the present application, and the chip shown in FIG. 5 may also be used to execute each step of that method.
  • an embodiment of the present application provides a system architecture 300.
  • the system architecture includes a local device 301, a local device 302, an execution device 310, and a data storage system 350.
  • the local device 301 and the local device 302 are connected to the execution device 310 through a communication network.
  • the execution device 310 may be implemented by one or more servers.
  • the execution device 310 can be used in conjunction with other computing devices, such as data storage, routers, load balancers and other devices.
  • the execution device 310 may be arranged on one physical site or distributed on multiple physical sites.
  • the execution device 310 may use the data in the data storage system 350 or call the program code in the data storage system 350 to implement the neural network model acquisition method or the image processing method in the embodiment of the present application.
  • the execution device 310 may execute the following process:
  • the pre-trained super network model is obtained by training based on the source data set;
  • a target neural network model can be obtained, and the target neural network model can be used for image classification or image processing.
  • Each local device can represent any computing device, such as personal computers, computer workstations, smart phones, tablets, smart cameras, smart cars or other types of cellular phones, media consumption devices, wearable devices, set-top boxes, game consoles, etc.
  • Each user's local device can interact with the execution device 310 through a communication network of any communication mechanism/communication standard.
  • the communication network can be a wide area network, a local area network, a point-to-point connection, or any combination thereof.
  • The local device 301 and the local device 302 obtain the relevant parameters of the target neural network model from the execution device 310, deploy the target neural network model on the local device 301 and the local device 302, and use the target neural network model to perform image classification, image processing, and so on.
  • The target neural network model can be directly deployed on the execution device 310.
  • The execution device 310 obtains the image to be processed from the local device 301 and the local device 302, and uses the target neural network model to classify the image to be processed or perform other types of image processing on it.
  • the foregoing execution device 310 may also be a cloud device. In this case, the execution device 310 may be deployed in the cloud; or, the foregoing execution device 310 may also be a terminal device. In this case, the execution device 310 may be deployed on the user terminal side. This is not limited.
  • A cloud platform based on automatic machine learning (AutoML) can perform network design and search according to the restrictions set by the user, and provide the user with the trained network model obtained from the network design and search.
  • Restrictions can include the type of network model, the accuracy of the network model, the delay of the network model, and the operating platform of the network model.
  • a neural network model can be obtained according to user requirements, the performance of the obtained neural network model is improved, and the processing efficiency in the process of obtaining the neural network model is improved.
  • Figure 7 shows a schematic structural diagram of the AutoML framework.
  • AutoML usually defines a search space in advance, and the search space refers to the searchable range.
  • AutoML continuously generates sub-network model configurations in the search space, and forms a closed loop of evaluation-feedback-generating sub-network model configurations again, until the final search results in an excellent neural network model.
  • the search space is determined according to the specific tasks of AutoML.
  • For example, when the specific task is to obtain a neural network model, the search space can include multiple neural network model structural units, and the final neural network model is formed by combining these structural units in the search space.
  • the controller 710 is configured to select different configurations in the search space and assign them to the evaluator 720 for evaluation, and then perform a policy update according to the evaluation result of the metric feedback of the evaluator 720, or in other words, perform a configuration update.
  • The controller 710 may select or search for neural network model structural units in the search space, combine them to obtain one or more sub-network models, select a sub-network model from the combined sub-network models, and assign the sub-network model configuration to the evaluator 720 for evaluation.
  • the evaluator 720 is used to evaluate the performance indicators of different configurations, and feed back the obtained evaluation results to the controller 710.
  • the evaluator 720 may evaluate the performance index of the sub-network model selected by the controller 710.
  • the performance indicators may include the accuracy of the neural network model and the time delay of the network model.
  • the evaluation result is fed back to the optimizer for the controller 710 to update the configuration until the target neural network model is obtained.
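  • The closed loop between a controller that proposes configurations and an evaluator that scores them can be sketched as a toy search in Python. Everything specific in this sketch is an assumption for illustration: the search space, the scoring formula in `evaluate` (a stand-in, not a real accuracy or latency measurement), and the simple keep-the-best update rule, which is not the policy update used by the application.

```python
import random

# Hypothetical search space: each key is one configurable dimension of a sub-network.
SEARCH_SPACE = {"depth": [2, 4, 6, 8], "width": [16, 32, 64], "kernel": [3, 5, 7]}


def evaluate(config):
    """Toy evaluator: rewards capacity and penalizes a made-up latency proxy."""
    accuracy_proxy = config["depth"] * 0.05 + config["width"] * 0.004
    latency_proxy = config["depth"] * config["width"] * config["kernel"] / 4000.0
    return accuracy_proxy - latency_proxy


def mutate(config, rng):
    """Controller step: re-sample one randomly chosen dimension of a configuration."""
    child = dict(config)
    key = rng.choice(sorted(SEARCH_SPACE))
    child[key] = rng.choice(SEARCH_SPACE[key])
    return child


def search(rounds=50, seed=0):
    """Generate, evaluate, and feed back until the round budget is used up."""
    rng = random.Random(seed)
    best = {k: rng.choice(v) for k, v in sorted(SEARCH_SPACE.items())}
    best_score = evaluate(best)
    for _ in range(rounds):
        candidate = mutate(best, rng)
        score = evaluate(candidate)       # feedback from the evaluator
        if score > best_score:            # controller keeps the better configuration
            best, best_score = candidate, score
    return best, best_score


print(search())
```

  • A real evaluator would train or fine-tune each candidate and measure indicators such as accuracy and time delay, which is why this loop dominates the cost of AutoML.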
  • the training cost of AutoML is much higher than the training cost of ordinary neural network models.
  • the training cost of AutoML such as training time and calculation amount is at least one order of magnitude higher than that of ordinary neural network models.
  • AutoML usually requires a lot of data to get an excellent neural network model. In some small data scenarios, it is often difficult for AutoML to directly train and produce an excellent model.
  • Transfer learning can apply knowledge or patterns learned in a certain field or task to different but related fields or problems.
  • transfer learning can enable neural network models to empower small data sets and reduce training resources.
  • A neural network model obtained through transfer learning may not meet the needs of users. For example, the user's cost requirements may not be met.
  • The neural network model before the migration may not meet the user's cost requirements, and the cost of the neural network model after the migration is basically the same as before the migration. If the neural network model before the migration cannot meet the user's cost requirements, the migrated neural network model naturally cannot either. As another example, the user's accuracy requirements may not be met.
  • The neural network model before the migration is obtained by training based on the source data set, and the architecture of the neural network model obtained after the migration is basically unchanged. If the architecture is inherently insufficient or not suitable for the target task, then blindly tuning the parameters based on the target data set will not significantly improve the neural network model; that is, the accuracy of a neural network model with this architecture may not meet the user's accuracy requirements.
  • FIG. 8 is a schematic block diagram of a system 800 for obtaining a neural network model according to an embodiment of the present application.
  • the system 800 for obtaining the neural network model may be a cloud service device or a mobile terminal, for example, a device such as a computer or a server with sufficient computing power to obtain the neural network model, or it may be a system composed of a cloud service device and a mobile terminal.
  • the system 800 for acquiring a neural network model mainly includes: a pre-training module 810, an input module 820, a migration module 830, a search module 840, a test module 850, and an output module 860.
  • the pre-training module 810 may be used to pre-train the super network model to obtain the weight of the super network model.
  • the super network model refers to a model that can cover all sub-network models in the search space.
  • the weight of the super network model is the weight of all sub-network models. In other words, the weight of the sub-network model can be obtained from the super-network model.
  • the super network model may be a pre-defined AutoML super network model.
  • the super network model can be defined according to the tasks that the neural network model needs to perform.
  • the pre-training module 810 is an optional module, and the pre-training process can be completed by other devices.
  • the migration module 830 may receive a super network model pre-trained by other devices.
  • the pre-training may be offline training, that is, the pre-training process may be completed in the offline phase.
  • in the embodiments of the present application, "online" and "offline" may refer to different stages relative to the user.
  • the system 800 in the offline phase is not affected by the user, and the super network model obtained by offline training can be stored for later processing in the online phase.
  • the system 800 in the online phase can accept the user's input and perform corresponding operations according to the user's input.
  • the pre-training process may be completed in the offline stage.
  • when the user uses the system 800 to obtain the required neural network model, the pre-trained super network model can be obtained directly through the migration module 830, without performing the pre-training operation online.
  • the pre-training module 810 may be located on a cloud server or on a local device.
  • the pre-training module 810 pre-trains the super network model based on the source data set.
  • the source data set may be a data set related to tasks that the target neural network model needs to perform.
  • the source data set may include the source sample image and the classification label of the source sample image.
  • the source data set may be the public data set ImageNet.
  • the input module 820 may be used to receive user input data.
  • the input module 820 may receive any one or more of the following: target data set, hyperparameters, target cost, target accuracy, target search duration, target loss function, and so on.
  • the target data set is used for fine-tuning the super network model output by the pre-training module 810.
  • hyperparameters of the neural network model include parameters that do not change during the training process of the neural network model. Hyperparameters are not obtained through the training of the neural network model, and are usually determined before the training of the neural network model.
  • the hyperparameters of the neural network model include: the learning rate of the neural network model, the label smoothing coefficient of the neural network model, or the dropout parameter of the neural network model.
  • the target cost refers to the hardware cost of the target neural network model output by the output module 860 on the target device.
  • the target accuracy refers to the inference accuracy of the target neural network model output by the output module 860.
  • the target search duration refers to the duration of the search for sub-network models in the super network model to obtain the target neural network model.
  • the target loss function is used to fine-tune the super network model output by the pre-training module 810.
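  • The inputs listed above can be grouped into a single record. The following Python sketch is only one possible representation; the class name InputData, the field names, and the units are assumptions and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class InputData:
    """Hypothetical container for what the input module 820 may receive."""

    target_dataset: Optional[Any] = None            # data used to fine-tune the super network
    hyperparameters: dict = field(default_factory=dict)  # e.g. learning rate, dropout
    target_cost: Optional[float] = None             # hardware cost on the target device
    target_accuracy: Optional[float] = None         # required inference accuracy
    target_search_seconds: Optional[float] = None   # search duration (unit is assumed)
    target_loss: Optional[Callable] = None          # loss function used for fine-tuning

    def provided(self):
        """Names of the fields the user actually supplied."""
        return [
            name
            for name, value in vars(self).items()
            if value not in (None, {})
        ]
```

  • Every field is optional here because the text says that any one or more of the items may be received; a default value can then be used for anything the user does not supply.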
  • the migration module 830 may be used to perform transfer learning on the pre-trained super network model based on the target data set. This can also be understood as migrating the weights of the super network model obtained by the pre-training module 810 to the target data set.
  • the migration module 830 can be located on the cloud server or on the local device.
  • the migration module 830 may receive the pre-trained super network model sent by the pre-training module 810. For example, if the migration module 830 and the pre-training module 810 are located in different devices, the pre-trained super network model can be transmitted between the migration module 830 and the pre-training module 810 through the communication network.
  • the migration module 830 may fine-tune the weight of the super network model obtained by the pre-training module 810 based on the target data set.
  • the target data set may be the data set input by the input module 820.
  • the migration module 830 may load the weights of the super network model output by the pre-training module 810, and fine-tune the super network model according to the target data set input by the user.
  • the migration module 830 may load the weight of the super network model output by the pre-training module 810, and fine-tune the super network model according to the target loss function input by the user.
  • the search module 840 may be used to search for the sub-network model in the super network model output by the migration module 830 to obtain the target neural network model.
  • the search module 840 can be located on the cloud server or on the local device.
  • the performance index of the target neural network model can meet the target performance index. That is, the search module 840 can search for a target sub-network model whose performance index meets the target performance index in the super network model, and determine the target neural network model according to the target sub-network model.
  • the performance index of the sub-network model may include the inference accuracy of the sub-network model, the hardware cost of the sub-network model, or the inference time length of the sub-network model, etc.
  • Target performance indicators can include target accuracy, target cost, or target inference time.
  • the target performance index may be a default or input through the input module 820.
  • the user can input a desired target performance index through the input module 820.
  • the target sub-network model may be a sub-network model whose inference accuracy reaches the target accuracy.
  • the target sub-network model may be a sub-network model whose hardware cost meets the target cost and whose inference accuracy reaches the target accuracy.
  • the test of the hardware overhead of the sub-network model may be performed by the test module 850.
  • the test module 850 is located on the target device. In other words, the sub-network model is deployed on the target device, and the hardware overhead is tested by the test module 850.
  • the search duration of the search module 840 can satisfy the target search duration.
  • the search module 840 may search the super network model, with a search duration that meets the target search duration, for a target sub-network model whose performance index meets the target performance index, and determine the target neural network model according to the target sub-network model.
  • the test module 850 is used to test the hardware overhead of different sub-network models on the target device. It should be understood that the test module 850 is an optional module. The test module 850 is located on the target device.
  • the output module 860 is used to output the target neural network model obtained by the search module 840.
  • the method 900 for obtaining a neural network model will be described in detail below with reference to FIG. 9.
  • the method shown in FIG. 9 may be executed by an apparatus for obtaining a neural network model, for example, executed by the training device 120 shown in FIG. 2, or executed by the system 800 shown in FIG. 8.
  • the device for acquiring the neural network model may be a cloud service device or a mobile terminal, for example, a device such as a computer or a server with sufficient computing power to execute the method 900, or it may be a system composed of a cloud service device and a mobile terminal.
  • the method 900 includes steps S910 to S940. Steps S910 to S940 will be described in detail below.
  • S910 Pre-train the super network model.
  • pre-training the super network model can also be understood as pre-training the weights of the super network model.
  • step S910 may be performed by the pre-training module 810 in FIG. 8 or the training device 120 in FIG. 2. It should be understood that this is only for illustration, and in this embodiment of the present application, step S910 may also be performed by other devices. That is to say, the pre-trained super network model obtained in step S930 may be a model trained by other devices.
  • the super network model may also be referred to as a super network or a super model.
  • the super network model refers to a model that can cover all sub-network models in the search space.
  • the weight of the super network model is the weight of all sub-network models. In other words, the weight of the sub-network model can be obtained from the super-network model.
  • the neural network model is formed by stacking multiple layers of operators.
  • the neural network model can be represented by a directed acyclic graph formed by stacking multiple layers of operators.
  • in a neural network model, each layer is a node, and the operator at each node is a single operator.
  • Each layer in the super network model includes multiple operators; that is to say, there are multiple candidate operators at each node.
  • the operators of adjacent layers are connected in a fully connected manner, and each path through these connections is a sub-network model.
  • the neural network model composed of the selected operators in the multiple layers is a sub-network model.
  • Updating the weights on a path, that is, updating the weights of a sub-network model, also updates the weights of the super network model, which achieves the effect of training the super network model.
  • the candidate operators of one layer include operators such as operator 411 and operator 412.
  • the candidate operators in this layer may all be convolutions, and the numbers of channels of operator 411 and operator 412 may be different.
  • the operator refers to the basic unit of neural network model calculation.
  • the operator can also be understood as the structural unit of the neural network model or the "block" in the neural network model.
  • the relationship between the aforementioned super network model and sub-network models can also be understood as follows: each layer of the super network model contains multiple blocks to choose from; choosing one block in each layer and combining the selected blocks forms a sub-network model.
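  • The relationship between the super network model and its sub-network models can be sketched in Python as follows. The toy super network below stores one scalar weight per candidate operator; the operator names, the weight values, and the helper functions are assumptions made purely for illustration.

```python
import random

# Hypothetical super network: every layer holds several candidate operators,
# each with its own weight (a single float here to keep the sketch small).
supernet = [
    {"conv3x3_c16": 0.10, "conv3x3_c32": 0.20},  # layer 0 candidates
    {"conv5x5_c16": 0.30, "conv5x5_c32": 0.40},  # layer 1 candidates
    {"conv3x3_c32": 0.50, "conv7x7_c32": 0.60},  # layer 2 candidates
]


def sample_path(net, rng):
    """Choose one candidate operator per layer; the resulting path is a sub-network."""
    return [rng.choice(sorted(layer)) for layer in net]


def subnet_weights(net, path):
    """Read the sub-network's weights directly out of the super network."""
    return [net[i][op] for i, op in enumerate(path)]


def update_path(net, path, deltas):
    """Updating the weights on a path updates the super network in place."""
    for i, (op, delta) in enumerate(zip(path, deltas)):
        net[i][op] += delta


def count_subnets(net):
    """Number of distinct paths, i.e. sub-networks, covered by the super network."""
    total = 1
    for layer in net:
        total *= len(layer)
    return total
```

  • Because a sub-network only reads and writes entries of the shared structure, no separate copy of its weights is needed, which is the sense in which the weights of a sub-network model can be obtained from the super network model.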
  • the operators may include: activation operators, feature extraction operators, normalization operators, overfitting prevention operators, and the like.
  • the activation operator may include: rectified linear unit (ReLU), sigmoid, and so on.
  • Feature extraction operators may include: convolution, full connection, and so on.
  • the normalization operator may include: batch normalization and so on.
  • the anti-overfitting operator may include: pooling and so on.
  • the sub-network models in the super network model may share the same network topology. Specifically, the direction of the data flow between the blocks constituting each sub-network model may be the same.
  • the convolution kernel sizes, the numbers of layers, or the numbers of channels of the sub-network models in the super network model can be different.
  • the super network model can be predefined.
  • the search space can be predefined according to the tasks to be performed by the target neural network model.
  • the search space can be used as the searchable range in step S940.
  • the super network model can be defined according to the tasks that the target neural network model needs to perform.
  • the number or types of operators that can be selected in the super network model can be determined according to the tasks to be performed by the target neural network.
  • the tasks required by the target neural network may include image classification, image segmentation, or target detection.
  • step S910 can be completed in the offline phase. For example, when a user wants to obtain a desired neural network model, the pre-trained super network model can be obtained directly, without performing the pre-training operation online.
  • step S910 is an optional step.
  • the method 900 can be executed from step S920.
  • the super network model can be pre-trained based on the source data set.
  • the source data set can be a data set with a large amount of data, which can ensure that the super network model is fully trained, and a more accurate super network model can be obtained.
  • the source data set may be a data set related to the task to be performed by the target neural network model.
  • the source data set may include source sample data and labels corresponding to the source sample data.
  • the source data set may include the source sample image and the classification label of the source sample image.
  • the source data set may be the public data set ImageNet.
  • the source data set may include the source sample image and the classification labels of the pixels in the source sample image.
  • the source data set may include the source sample image and the object classification label and the bounding box of the object in the source sample image.
  • step S910 includes: pre-training the super network model based on a single path algorithm.
  • the super network model is pre-trained based on a progressive shrinking (PS) algorithm.
  • Figure 11 shows a schematic diagram of a progressive shrinking algorithm. As shown in Figure 11, the largest sub-network model is trained first, and then the sub-network models with a variable convolution kernel, the sub-network models with a variable number of layers, and the sub-network models with a variable number of channels are trained in turn.
  • the largest sub-network model refers to a sub-network model with the largest convolution kernel (kernel), the largest number of layers (depth), and the largest number of channels (width) in the super network model.
  • training the sub-network models with a variable convolution kernel may mean sampling, from the super network model, multiple sub-network models with the largest number of layers and the largest number of channels, and training them. That is to say, in this training stage, the number of layers of the multiple sub-network models to be trained is D and the number of channels is W, while the convolution kernel sizes of the multiple sub-network models may be different. Here, D represents the maximum number of layers in the super network model, and W represents the maximum number of channels in the super network model.
  • the sub-network models with a variable convolution kernel may be trained by using a random single-path algorithm to sample multiple sub-network models with the largest number of layers, the largest number of channels, and different convolution kernel sizes.
  • training the sub-network models with a variable number of layers may mean sampling, from the super network model, multiple sub-network models with the largest number of channels, and training them. That is to say, in this training stage, the number of channels of the multiple sub-network models to be trained is W, while the convolution kernel sizes of the multiple sub-network models may be different and the numbers of layers of the multiple sub-network models may be different. Here, W represents the maximum number of channels in the super network model.
  • the sub-network models with a variable number of layers may be trained by using a random single-path algorithm to sample multiple sub-network models with the largest number of channels, different numbers of layers, and different convolution kernel sizes.
  • the sub-network models with a variable number of channels can be trained by randomly sampling different sub-network models.
  • training the sub-network models with a variable number of channels may mean sampling multiple sub-network models from the super network model and training them. That is, in this training stage, the convolution kernel sizes, the numbers of layers, and the numbers of channels of the multiple sub-network models to be trained may all be different.
  • the sub-network models with a variable number of channels may be trained by using a random single-path algorithm to sample different sub-network models.
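  • The staged sampling used by progressive shrinking can be sketched in Python as follows. The candidate values in KERNELS, DEPTHS, and WIDTHS, the stage numbering, and the simplification that one kernel size and one width apply to the whole sub-network are assumptions made for this example.

```python
import random

# Hypothetical search space; the concrete values are illustrative only.
KERNELS = [3, 5, 7]   # convolution kernel sizes
DEPTHS = [2, 3, 4]    # numbers of layers
WIDTHS = [16, 32, 64]  # numbers of channels


def sample_subnet(stage, rng):
    """Sample a sub-network configuration for one progressive-shrinking stage.

    stage 0: the largest sub-network only
    stage 1: kernel size varies; depth and width stay at their maxima (D and W)
    stage 2: kernel size and depth vary; width stays at its maximum (W)
    stage 3: kernel size, depth, and width all vary
    """
    kernel = rng.choice(KERNELS) if stage >= 1 else max(KERNELS)
    depth = rng.choice(DEPTHS) if stage >= 2 else max(DEPTHS)
    width = rng.choice(WIDTHS) if stage >= 3 else max(WIDTHS)
    return {"kernel": kernel, "depth": depth, "width": width}


def sampling_schedule(samples_per_stage, seed=0):
    """List the configurations that would be trained, stage by stage."""
    rng = random.Random(seed)
    return [
        (stage, sample_subnet(stage, rng))
        for stage in range(4)
        for _ in range(samples_per_stage)
    ]
```

  • Each later stage enlarges the set of sub-networks that can be sampled while still including the configurations of the earlier stages.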
  • a sub-network model with a variable convolution kernel, a sub-network model with a variable number of layers, or a sub-network model with a variable number of channels can be trained by means of knowledge distillation.
  • Knowledge distillation refers to the transfer of knowledge from one neural network model to another neural network model.
  • the knowledge of the neural network model can be understood as the mapping relationship from input to output.
  • the mapping relationship between input and output in the neural network model is determined based on the parameters of the neural network model.
  • knowledge distillation can be understood as transferring the parameters of one neural network model to another neural network model.
  • knowledge distillation refers to training a student network model using the output of a trained teacher network model and the true labels of the training samples.
  • the teacher network model refers to the largest sub-network model, and the student network model refers to the sub-network model with a variable convolution kernel, the sub-network model with a variable number of layers, or the sub-network model with a variable number of channels.
  • training a sub-network model with a variable convolution kernel through knowledge distillation means that the source sample data is input into the trained largest sub-network model to obtain the output value of the largest sub-network model, and the sub-network model with a variable convolution kernel is trained based on this output value and the label corresponding to the source sample data.
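  • A training signal that uses both the teacher output and the true label can be sketched in Python as follows. The text does not specify how the two are combined; the weighted sum of two cross-entropy terms and the weighting factor alpha are assumptions chosen because they are a common form of distillation loss.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross entropy between a target distribution and a predicted distribution."""
    return -sum(
        t * math.log(p) for t, p in zip(target_probs, predicted_probs) if t > 0
    )


def distillation_loss(student_logits, teacher_logits, label, alpha=0.5):
    """Loss for the student sub-network.

    alpha (assumed) weights the term that follows the teacher's output;
    1 - alpha weights the term that follows the true label.
    """
    student = softmax(student_logits)
    teacher = softmax(teacher_logits)
    one_hot = [1.0 if i == label else 0.0 for i in range(len(student_logits))]
    soft_term = cross_entropy(teacher, student)  # match the teacher output
    hard_term = cross_entropy(one_hot, student)  # match the true label
    return alpha * soft_term + (1.0 - alpha) * hard_term
```

  • In the setting described above, teacher_logits would come from the trained largest sub-network model and student_logits from the smaller sub-network model being trained on the same source sample.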
  • training with the progressive shrinking algorithm reduces the mutual influence of sub-network models of different sizes during the training process, and the obtained super network model can support a variety of different architecture settings.
  • these architecture settings include sub-network models with different numbers of layers, different numbers of channels, and different convolution kernel sizes.
  • S920 Acquire input data, where the input data includes a target data set.
  • the input data may also include any one or more of the following: hyperparameters, target cost, target accuracy, target search duration, or target loss function.
  • this step may be performed by the input module 820 in FIG. 8.
  • the target data set may be determined according to the tasks to be performed by the target neural network model. That is to say, the tasks performed by the sub-network model in the pre-trained super network model and the target neural network model are consistent. Or, it can be understood that the task corresponding to the source data set is the same as the task corresponding to the target data set. For example, both are used for image classification; or both are used for image segmentation; or both are used for target detection.
  • the target data set may include the target sample image and the classification label of the target sample image.
  • for example, if the target neural network model is used for vehicle recognition, the target data set may include the target vehicle image and the classification label of the target vehicle image.
  • the target data set may include the target sample image and the classification labels of the pixels in the target sample image.
  • the target data set may include a target detection data set.
  • the target detection data set may include the target sample image, the object classification label in the target sample image, and the bounding box of the object.
  • the target data set may be a data set input by a user, or a data set obtained from another device.
  • if step S920 is executed by a cloud service device, the other device may be the target device.
  • the target device may be the device on which the target neural network model is to be deployed.
  • S930 Perform transfer learning on the pre-trained super network model based on the target data set to obtain the super network model after transfer learning.
  • Performing transfer learning on the pre-trained super network model based on the target data set may be fine-tuning the pre-trained super network model based on the target data set.
  • Fine-tuning refers to applying a pre-trained model to the target data set and adapting the parameters of the model to the target data set.
  • the transfer learning of the pre-trained super network model specifically refers to the transfer of the weights of the pre-trained super network model.
  • step S930 may be performed by the migration module 830 in FIG. 8 or the training device 120 in FIG. 2. It should be understood that this is only for illustration, and in this embodiment of the present application, step S930 may also be performed by other devices.
  • the pre-trained super network model is obtained, or the weight of the pre-trained super network model is loaded, and the super network model is fine-tuned based on the target data set.
  • the device that performs step S910 and the device that performs step S930 may be the same or different.
  • the pre-trained super network model may be transmitted through the communication network.
  • step S930 includes fine-tuning the pre-trained super network model through a single-path algorithm.
  • the sub-network model can be uniformly sampled and trained to improve the training effect.
  • the memory space can be reduced and efficient training can be achieved.
  • Fine-tuning the pre-trained super network model through a single-path algorithm includes:
  • selecting a sub-network model from the pre-trained super network model, calculating the weight gradient of the sub-network model based on the target data set, and updating the weights of the sub-network model based on that weight gradient to obtain the updated super network model; and repeating the above steps until the updated super network model satisfies the termination condition, at which point the super network model after transfer learning is obtained.
  • selecting a sub-network model from the super-network model can be randomly selecting a sub-network model from the super-network model.
  • the following takes an iterative training process as an example to illustrate the method for updating the weight of the currently selected sub-network model.
  • a sub-network model in the super network model is selected during each forward propagation; that is, the target sample data is input into the sub-network model, the loss value (loss) corresponding to the output of the sub-network model is calculated by the loss function, the weight gradient of the current sub-network model is calculated by back propagation of the loss value, and the weights of the sub-network model are adjusted according to the weight gradient.
  • the function value of the loss function is used to indicate the difference between the classification label of the target sample image and the predicted label output by the sub-network model.
  • the weight of the sub-network model is updated according to the difference between the two, until the predicted label of the neural network model is very close to the label of the training data. For example, the higher the function value of the loss function, the greater the difference, so the training of the neural network model becomes a process of reducing the value of this function as much as possible.
  • the loss function can also be the objective function.
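  • One iteration of single-path fine-tuning can be sketched in Python as follows. The toy model (an output equal to the input times the sum of the selected weights), the squared-error loss, and the function names are assumptions chosen so that the gradient can be written by hand; they are not the model or loss of the application.

```python
import random


def forward(supernet, path, x):
    """Toy sub-network: the output is x times the sum of the selected operators' weights."""
    return x * sum(supernet[i][k] for i, k in enumerate(path))


def single_path_step(supernet, x, target, lr, rng):
    """Sample one path, compute its loss and gradient, and update only that path."""
    # Randomly select one candidate operator per layer (a single path).
    path = [rng.randrange(len(layer)) for layer in supernet]
    error = forward(supernet, path, x) - target
    loss = error ** 2
    # For this toy model the gradient is the same for every weight on the path.
    grad = 2.0 * error * x
    for i, k in enumerate(path):
        supernet[i][k] -= lr * grad
    return path, loss


def fine_tune(supernet, samples, lr, steps, seed=0):
    """Repeat single-path steps over (x, target) samples; returns the last loss."""
    rng = random.Random(seed)
    loss = None
    for step in range(steps):
        x, target = samples[step % len(samples)]
        _, loss = single_path_step(supernet, x, target, lr, rng)
    return loss
```

  • Only the weights on the sampled path change in a step, so the weights of operators that were not selected keep the values inherited from pre-training until a later step samples them.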
  • step S930 includes: selecting N_b sub-network models from the pre-trained super network model, calculating the weight gradients of the N_b sub-network models based on the target data set, and updating the weights of the pre-trained super network model based on the weight gradients of the N_b sub-network models to obtain the updated super network model; the above steps are repeated until the updated super network model meets the termination condition, and the super network model after transfer learning is obtained.
  • each time a sub-network model is selected, only a single path is activated; in other words, only one sub-network model is selected from the super network model at a time.
  • the following takes an iterative process as an example to illustrate the method of updating the weights of the super network model.
  • a sub-network model is selected from the super network model, the target sample image is input into the sub-network model, and the function value of the loss function is calculated.
  • the weight gradient of the current sub-network model is calculated.
  • the above process is repeated N_b times; that is, N_b sub-network models are selected, the weight gradients of the N_b sub-network models are calculated, and the weight gradients obtained over the N_b repetitions are accumulated.
  • the weight of the super network model is updated once.
  • This process can be regarded as one iteration. Iteration continues until the termination condition is met, and the super network model after transfer learning is obtained, that is, the migration of the super network model is completed.
  • N_b is a positive integer.
  • N_b can be preset or input by the user.
  • the function value of the loss function is used to indicate the difference between the classification label of the target sample image and the predicted label output by the sub-network model
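  • the weight gradient accumulated over the N_b repetitions can satisfy:
  • $dW = \sum_{i=1}^{N_b} \frac{\partial L_i}{\partial W}$
  • where dW represents the weight gradient of the super network model, W represents the weights of the super network model, and L_i represents the function value of the loss function during the i-th forward propagation in an iteration process.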
  • updating the weight of the super network model once may include: updating the weight of the super network model by subtracting the accumulated weight gradient from the weight of the current super network model.
  • updating the weight of the super network model once may include: updating the weight of the super network model by subtracting the product of the accumulated weight gradient and the learning rate from the weight of the current super network model.
  • the weights of the current super network model can satisfy:
  • $W_j = W_{j-1} - lr \cdot dW$
  • where W_j is the weight of the super network model after the j-th iteration, W_{j-1} is the weight of the super network model after the (j-1)-th iteration, and lr represents the learning rate.
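  • The accumulate-then-update iteration can be sketched in Python as follows. The toy model and squared-error loss are assumptions chosen so that the gradient can be written by hand, and summing the N_b gradients without averaging is likewise an assumption about what "accumulate" means here.

```python
import random


def accumulate_and_update(weights, x, target, n_b, lr, rng):
    """One iteration: accumulate gradients over n_b sampled paths, then update once.

    weights is a list of layers, each a list of candidate-operator weights.
    Returns the updated weights and the accumulated gradient dW.
    """
    d_w = [[0.0] * len(layer) for layer in weights]  # accumulated gradient dW
    for _ in range(n_b):
        # Select one sub-network (single path) for this forward propagation.
        path = [rng.randrange(len(layer)) for layer in weights]
        output = x * sum(weights[i][k] for i, k in enumerate(path))
        # Gradient of the squared error with respect to each weight on the path.
        grad = 2.0 * (output - target) * x
        for i, k in enumerate(path):
            d_w[i][k] += grad
    # Single update: W_j = W_{j-1} - lr * dW
    updated = [
        [w - lr * g for w, g in zip(layer, grads)]
        for layer, grads in zip(weights, d_w)
    ]
    return updated, d_w
```

  • The weights stay fixed while the N_b gradients are collected, so every sampled sub-network is evaluated against the same super network state before the single update is applied.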
  • the termination condition includes that the number of repetitions is greater than or equal to the first number of iterations.
  • the number of iterations can also be understood as the number of updates of the weights of the super network model.
  • the termination condition includes that the inference accuracy of the updated super network model is greater than or equal to the first inference accuracy.
  • the reasoning accuracy of the super network model may be the reasoning accuracy of at least one sub-network model in the super network model.
  • the termination condition may include: within a preset time interval, the change value of the inference accuracy of the z sub-network models is less than a set threshold.
  • the z sub-network models may be z pre-designated sub-network models. In other words, z sub-network models can be specified in advance, and at each iteration, the accuracy of the z sub-network models is tested. If, within a period of time or within a certain number of iterations, the accuracy of the z sub-network models does not change significantly, the migration can be terminated, that is, the migration of the super network model is completed.
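  • The termination conditions described above can be checked with a small helper such as the following Python sketch. The function name, the parameter names, and the choice to measure the change as the spread between the largest and smallest accuracy within the window are assumptions.

```python
def should_stop(update_count, max_updates, accuracy_history, window, threshold):
    """Decide whether fine-tuning of the super network can stop.

    accuracy_history holds one entry per iteration; each entry lists the
    accuracies of the z pre-designated sub-network models at that iteration.
    """
    # Condition 1: the number of weight updates reached the iteration limit.
    if update_count >= max_updates:
        return True
    # Condition 2 needs at least window + 1 measurements to compare.
    if len(accuracy_history) < window + 1:
        return False
    recent = accuracy_history[-(window + 1):]
    for m in range(len(recent[0])):
        values = [snapshot[m] for snapshot in recent]
        # Stop only if every tracked sub-network changed by less than the threshold.
        if max(values) - min(values) >= threshold:
            return False
    return True
```

  • For example, with a window of 2 iterations and a threshold of 0.01, training stops once none of the tracked sub-networks has moved by 0.01 or more over the last three measurements.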
  • the loss function in step S930 may be a preset loss function or a target loss function input by the user.
  • step S930 the above fine-tuning method is only an example, and other methods that can fine-tune the pre-trained super network model are all applicable to step S930, and the embodiment of the present application does not limit the method of fine-tuning the super network model.
  • S940 Search for a sub-network model from the super-network model after transfer learning to obtain a target neural network model.
  • this step may be performed by the search module 840 of FIG. 8 or the training device 120 of FIG. 2.
  • the performance index of the target neural network model can meet the target performance index. That is to say, the target sub-network model can be searched in the super network model after transfer learning, and the target neural network model can be determined according to the target sub-network model.
  • the target sub-network model may be a sub-network model whose performance index meets the target performance index.
  • the performance index of the sub-network model may include the inference accuracy of the sub-network model, the hardware cost of the sub-network model, or the inference time length of the sub-network model, etc.
  • Target performance indicators can include target accuracy, target cost, or target inference time.
  • the target cost refers to the hardware cost of the target neural network model on the target device.
  • the target accuracy refers to the inference accuracy of the target neural network model.
  • the target inference time refers to the inference time of the target neural network model.
  • the target performance index may be a preset target performance index or a target performance index input by the user.
  • step S920 further includes obtaining the target cost, target accuracy, target inference time, and the like.
  • the target sub-network model may be a sub-network model whose inference accuracy reaches the target accuracy.
  • the target sub-network model may be a sub-network model whose hardware cost meets the target cost and whose inference accuracy reaches the target accuracy.
  • the test of the hardware overhead of the sub-network model may be performed by the test module 850.
  • the sub-network model can be deployed on the target device to test its hardware overhead.
  • the search duration for the target sub-network model can satisfy the target search duration.
  • the target search duration refers to the duration of the search for sub-network models in the super network model.
  • the super network model is searched, with a search duration that meets the target search duration, for a target sub-network model whose performance index meets the target performance index, and the target neural network model is determined according to the target sub-network model.
  • the target search duration may be a preset target search duration or a target search duration input by the user.
  • the sub-network model is searched in the super-network model after transfer learning through the reinforcement learning algorithm to obtain the target neural network model.
  • an evolutionary algorithm is used to search for a sub-network model in the super-network model after migration learning to obtain the target neural network model.
  • Searching for sub-network models in the super-network model after migration learning through evolutionary algorithms can include:
  • Step 1: Determine n first sub-network models according to the super network model after transfer learning. These n first sub-network models can be used as the initial population.
  • the n sub-network models are the n first sub-network models.
  • Step 2: Adjust the structure of the n first sub-network models to obtain n second sub-network models.
  • adjusting the structure of the n first sub-network models may be adjusting the structure of the first sub-network models through operations such as crossover and mutation.
  • Step 3: Select n third sub-network models from the n first sub-network models and the n second sub-network models, and use the n third sub-network models as the n first sub-network models in step 2. Steps 2 and 3 are repeated until the search termination condition is satisfied.
  • the n third sub-network models are the new population.
  • n is a positive integer greater than 1.
  • n can be preset or input by the user.
  • the value of n can be obtained through the input module 820.
  • the search termination condition may be preset or determined based on user input data.
  • the search termination condition may be that the number of repetitions is greater than or equal to the second number of iterations.
  • the number of second iterations may be preset or input by the user.
  • the search termination condition may be that the accuracy of at least p third sub-network models in the n third sub-network models meets the target accuracy.
  • p is a positive integer, and p ≤ n.
  • p can be preset or input by the user.
  • the value of p can be obtained through the input module 820.
  • the search termination condition may be that the search duration reaches the target search duration.
  • the search termination condition may be that the hardware cost of at least q third sub-network models in the n third sub-network models meets the target cost.
  • q is a positive integer, and q ≤ n.
  • q can be preset or input by the user.
  • the value of q can be obtained through the input module 820.
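  • The example termination conditions above can be combined in a single check, sketched below. The function name `should_stop`, the dictionary keys `accuracy` and `cost`, and the choice to stop as soon as any one configured condition holds are assumptions of this sketch; the application leaves the combination open.

```python
def should_stop(population, iteration, elapsed_seconds, *,
                max_iterations=None, target_accuracy=None, p=None,
                target_cost=None, q=None, target_duration=None):
    """Return True if any configured search termination condition holds.

    population: list of dicts with 'accuracy' and 'cost' for the n
    sub-network models of the current iteration.
    """
    # Number of repetitions reaches the second number of iterations.
    if max_iterations is not None and iteration >= max_iterations:
        return True
    # At least p sub-network models meet the target accuracy.
    if target_accuracy is not None and p is not None:
        if sum(m["accuracy"] >= target_accuracy for m in population) >= p:
            return True
    # Search duration reaches the target search duration.
    if target_duration is not None and elapsed_seconds >= target_duration:
        return True
    # At least q sub-network models meet the target cost.
    if target_cost is not None and q is not None:
        if sum(m["cost"] <= target_cost for m in population) >= q:
            return True
    return False
```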
  • "extracting" can also be understood as "sampling".
  • determining n first sub-network models according to the super network model after transfer learning may include:
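  • n fourth sub-network models are selected from the super network model after transfer learning, the hardware overhead of the n fourth sub-network models on the target device is obtained, and the structure of the n fourth sub-network models is adjusted based on the hardware overhead to obtain the n first sub-network models.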
  • the following example illustrates the method of searching the sub-network model in the super network model after transfer learning to obtain the target neural network model.
  • n sub-network models are extracted from the super network model after migration learning, and the structure of the sub-network models is adjusted according to the hardware cost of the n sub-network models on the target device to obtain the adjusted n sub-network models. The adjusted n sub-network models are taken as the initial population; crossover and mutation are used to generate n new sub-network models; n sub-network models are selected from the resulting 2n sub-network models to form a new population; and the iteration continues until the search termination condition is satisfied. The n sub-network models obtained at the end are the search results.
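  • The iteration described in this example can be sketched as a small evolutionary loop. Everything concrete here is an assumption for illustration: a sub-network model is encoded as a tuple of per-stage depths, `toy_fitness` stands in for a measured inference accuracy, the termination condition is a fixed iteration budget, selection keeps the n fittest of the 2n candidates, and the hardware-cost-based adjustment of the initial population is omitted.

```python
import random

DEPTH_CHOICES = (1, 2, 3, 4)   # hypothetical per-stage depth options
NUM_STAGES = 4                 # hypothetical number of stages


def sample_arch(rng):
    """Randomly extract one sub-network encoding from the search space."""
    return tuple(rng.choice(DEPTH_CHOICES) for _ in range(NUM_STAGES))


def crossover(a, b, rng):
    """One-point crossover of two parent encodings."""
    cut = rng.randint(1, NUM_STAGES - 1)
    return a[:cut] + b[cut:]


def mutate(arch, rng):
    """Resample the depth of one randomly chosen stage."""
    i = rng.randrange(NUM_STAGES)
    return arch[:i] + (rng.choice(DEPTH_CHOICES),) + arch[i + 1:]


def toy_fitness(arch):
    """Stand-in for measured inference accuracy (not a real evaluation)."""
    return sum(arch)


def evolutionary_search(n, max_iterations, rng, fitness=toy_fitness):
    """Return the n sub-network encodings left after the search."""
    population = [sample_arch(rng) for _ in range(n)]   # initial population
    for _ in range(max_iterations):                      # iteration budget
        children = []
        for _ in range(n):
            a, b = rng.sample(population, 2)             # requires n > 1
            children.append(mutate(crossover(a, b, rng), rng))
        candidates = population + children               # 2n sub-network models
        population = sorted(candidates, key=fitness, reverse=True)[:n]
    return population
```

  • Because the new population is chosen from the union of parents and children, the best fitness in the population never decreases from one iteration to the next in this sketch.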
  • n is a positive integer greater than 1.
  • n can be preset or input by the user.
  • the value of n can be obtained through the input module 820.
  • the search termination condition may be preset or determined based on the user's input.
  • the search termination condition may be that the current number of iterations reaches the second number of iterations.
  • the number of second iterations may be preset or input by the user.
  • the search termination condition may be that the accuracy of at least p of the n sub-network models obtained in the current iteration meets the target accuracy.
  • p is a positive integer, and p ≤ n.
  • p can be preset or input by the user.
  • the value of p can be obtained through the input module 820.
  • the iteration termination condition may be that the search duration reaches the target search duration.
  • the iteration termination condition may be that the hardware cost of at least q sub-network models among the n sub-network models obtained in the current iteration meets the target cost.
  • q is a positive integer, and q ≤ n.
  • q can be preset or input by the user.
  • the value of q can be obtained through the input module 820.
  • search termination conditions are only examples, and the search termination conditions can be set as required.
  • the search termination conditions can include the above two conditions.
  • extracting n sub-network models from the super network model may include: randomly extracting n sub-network models from the super network model.
  • selecting n sub-network models from 2n sub-network models to form a new population may include multiple ways.
  • the following example illustrates how to select n sub-network models from 2n sub-network models.
  • selecting n sub-network models from the 2n sub-network models to form a new population may include: randomly selecting n sub-network models from the 2n sub-network models to form a new population.
  • selecting n sub-network models from the 2n sub-network models to form a new population may include: testing the hardware overhead of the 2n sub-network models, and selecting n sub-network models to form the new population according to the limitation of the hardware overhead.
  • selecting n sub-network models from the 2n sub-network models to form a new population may include: testing the inference accuracy of the 2n sub-network models, and selecting n sub-network models with the highest inference accuracy to form the new population.
  • selecting n sub-network models from the 2n sub-network models to form a new population may include: testing the inference accuracy and hardware cost of the 2n sub-network models, and selecting the n sub-network models with the highest inference accuracy within the limit of the hardware cost to form a new population.
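  • The last selection strategy above, highest inference accuracy within the hardware cost limit, can be sketched as follows. The function name `select_population` and the dictionary keys are hypothetical. The application does not say what happens when fewer than n candidates satisfy the limit; this sketch assumes the remainder is filled with the cheapest of the other candidates.

```python
def select_population(candidates, n, cost_limit=None):
    """Pick n candidates: most accurate first among those within cost_limit.

    candidates: list of dicts with 'accuracy' and 'cost' entries.
    """
    if cost_limit is None:
        feasible, infeasible = list(candidates), []
    else:
        feasible = [c for c in candidates if c["cost"] <= cost_limit]
        infeasible = [c for c in candidates if c["cost"] > cost_limit]
    ranked = sorted(feasible, key=lambda c: c["accuracy"], reverse=True)
    if len(ranked) < n:
        # Assumption of this sketch: fall back to the cheapest over-limit models.
        ranked += sorted(infeasible, key=lambda c: c["cost"])
    return ranked[:n]
```

  • Passing `cost_limit=None` reduces this to the accuracy-only strategy described in the preceding item.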
  • adjusting the sub-network model structure according to the hardware cost of the n sub-network models on the target device to obtain the adjusted n sub-network models includes: adjusting the sub-network model structure according to the probability of the sub-network model structure adjustment, so that the adjusted sub-network model can meet the target cost.
  • the probability of adjusting the structure of the sub-network model is determined according to the hardware overhead of the sub-network model.
  • for a sub-network model with a large hardware overhead, the probability of adjusting the current sub-network model to a smaller sub-network model is greater than the probability of adjusting it to a larger sub-network model.
  • for a sub-network model with a small hardware overhead, the probability of adjusting the current sub-network model to a larger sub-network model is greater than the probability of adjusting it to a smaller sub-network model.
  • the size of the hardware overhead may be determined relative to the target overhead.
  • a sub-network model whose hardware overhead is greater than the target cost can be regarded as a sub-network model with a large hardware overhead, and a sub-network model whose hardware overhead is less than the target cost can be regarded as a sub-network model with a small hardware overhead.
  • the size of the hardware overhead may also be determined relative to other benchmarks, which is not limited in the embodiment of the present application.
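  • The cost-dependent adjustment probability can be sketched as follows. The names `shrink_probability` and `adjust_depth`, the bias value 0.8, and the use of a single depth value as the "structure" are all assumptions for illustration; the application only requires that shrinking be more likely when the overhead is large and growing be more likely when it is small.

```python
import random


def shrink_probability(cost, target_cost, bias=0.8):
    """Probability of adjusting toward a smaller sub-network model."""
    if cost > target_cost:      # large hardware overhead: favour shrinking
        return bias
    if cost < target_cost:      # small hardware overhead: favour growing
        return 1.0 - bias
    return 0.5


def adjust_depth(depth, cost, target_cost, rng, min_depth=1, max_depth=4):
    """Shrink or grow a hypothetical depth setting by one step."""
    if rng.random() < shrink_probability(cost, target_cost):
        return max(min_depth, depth - 1)
    return min(max_depth, depth + 1)
```

  • In practice the measured cost would come from deploying the sub-network model on the target device, as described above; here it is simply passed in as a number.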
  • the target device may include GPU or NPU.
  • the heuristic search method can be used to perceive the hardware cost of the sub-network model on the target device, and adjust the structure of the sub-network model based on the hardware cost, so that the final sub-network model can meet the target cost.
  • the target sub-network model is fed back to the user.
  • the target sub-network model can be one sub-network model or multiple sub-network models.
  • the target sub-network model includes m sub-network models, where m is a positive integer and m ≤ n.
  • the n sub-network models in the search result include the m sub-network models. m can be preset or input by the user.
  • search results can be fed back to the user according to the needs of the user.
  • the following is an example of how to feed back search results to users.
  • m sub-network models may be fed back to the user, and the user selects the desired sub-network model as the target neural network model.
  • the m sub-network models fed back to the user may be m sub-network models randomly selected from the search results.
  • the m sub-network models fed back to the user may be the m sub-network models with the highest accuracy in the search result, and further, the m sub-network models may be fed back to the user based on the accuracy ranking.
  • the m sub-network models fed back to the user may be the m sub-network models with the least cost in the search result, and further, the m sub-network models may be fed back to the user based on the cost-based ranking.
  • the m sub-network models fed back to the user may be the m sub-network models with the highest accuracy within the target cost range.
  • the accuracy of the m sub-network models can be fed back to the user, and the user selects the desired sub-network model as the target neural network model.
  • the cost of the m sub-network models can be fed back to the user, and the user can select the desired sub-network model as the target neural network model.
  • the neural network model that meets the user's needs can be obtained by searching the sub-network model in the super network model, and the target data set can be adapted to meet the user's needs, for example, to meet the user's cost/precision requirements.
  • the weight of the super network model is shared between different data sets.
  • the source data set and the target data set are data sets related to the same task, which enables efficient migration learning for AutoML. During migration, only the weights of the super network model are fine-tuned, and the structure of the super network model does not need to be adjusted. This can greatly improve the migration efficiency of AutoML, reduce the time required for training by at least an order of magnitude, and even bring it down to the training time of an ordinary neural network model.
  • the migration time of the super network model provided in the embodiment of the present application is close to the migration time of an ordinary neural network model. That is to say, compared with obtaining the target neural network model through transfer learning of an ordinary neural network model, the method of obtaining the neural network model in the embodiments of the present application can meet users' fine-grained cost/precision requirements under the same training time, and can obtain a target neural network model with higher accuracy under the same cost.
  • the super network model trained based on the progressive contraction method can support a variety of different architecture settings. After the super network model training is completed, a suitable sub-network model can be selected from the super network model without additional training.
  • the sub-network model does not need to be retrained, and the accuracy of the sub-network model can also be guaranteed to meet the requirements of pre-training.
  • FIG. 12 shows a schematic flowchart of a method for obtaining a neural network model according to an embodiment of the present application.
  • the method in FIG. 12 includes step S1110 to step S1150.
  • the method in FIG. 12 can be regarded as an embodiment of the method 900 in FIG. 9.
  • the method in FIG. 12 is divided into two phases, an offline phase and an online phase, for description below.
  • the predefined search space is the predefined super network model.
  • the super network model is pre-trained based on the source data set.
  • the source data set may be a data set related to tasks that the target neural network model needs to perform.
  • the source data set may include the source sample image and the classification label of the source sample image.
  • the source data set can be the public data set ImageNet.
  • the target data set may be determined according to the tasks to be performed by the target neural network model.
  • the target data set may include the target sample image and the classification label of the target sample image.
  • the target data set may be a target data set input by the user.
  • S1140 Perform migration learning on the pre-trained super network model based on the target data set.
  • the pre-trained super network model can be migrated to different target data sets.
  • Step S1140 includes performing migration learning on the pre-trained super network model based on the bird data set to obtain the super network model 1 after migration learning, and performing migration learning on the pre-trained super network model based on the vehicle data set to obtain the super network model 2 after migration learning.
  • S1150 Search in the super network model after transfer learning to obtain the target neural network model.
  • different user requirements may include that the inference accuracy of the target neural network 1 reaches the target accuracy 1, and the inference accuracy of the target neural network 2 reaches the target accuracy 2.
  • the above description only takes the application of the target neural network model to image classification as an example.
  • the method for obtaining a neural network model provided by the embodiments of the present application can also be applied to other computer vision tasks, for example, scenarios such as target detection and image segmentation.
  • the target neural network model can also be applied to non-visual tasks, for example, scenarios such as natural language processing or speech recognition.
  • the source data set and the target data set can be determined according to the application scenario.
  • the source data set may include the source sample audio signal and the classification label corresponding to the source sample audio signal, and the target data set may include the target sample audio signal and the classification label corresponding to the target sample audio signal.
  • FIG. 13 shows a schematic flowchart of an image processing method 1200 provided by an embodiment of the present application.
  • the method may be executed by an apparatus or device capable of image processing.
  • the method may be executed by a terminal device, a computer, a server, or the like.
  • the target neural network model used in the image processing method 1200 in FIG. 13 may be constructed by the method in FIG. 9 or the method in FIG. 12 described above.
  • the method 1200 includes step S1210 to step S1220.
  • for the specific implementation of step S1210 to step S1220 in the method 1200, reference may be made to the aforementioned method 900. To avoid unnecessary repetition, repetitive descriptions are appropriately omitted when the method 1200 is introduced below.
  • the image to be processed may be an image captured by a terminal device (or other apparatus or device such as a computer or server) through a camera, or the image to be processed may be an image obtained from inside a terminal device (or other apparatus or device such as a computer or server), for example, an image stored in an album of the terminal device or an image obtained by the terminal device from the cloud, which is not limited in the embodiment of the present application.
  • S1220 Use the target neural network model to process the image to be processed to obtain a processing result of the image to be processed.
  • the target neural network model is obtained by searching the sub-network model in the super network model.
  • the super network model is obtained by performing migration learning on the pre-trained super network model based on the target data set.
  • the pre-trained super network model is trained based on the source data set. Both the source data set and the target data set are data sets related to the task of image processing.
  • the target neural network model is used to classify the image to be processed, and the classification result is output.
  • the target neural network model is obtained by searching for the sub-network model in the super network model, and the super network model is obtained by migration learning of the pre-trained super network model based on the target data set.
  • the pre-trained super network model is trained based on the source data set.
  • the target data set includes a target sample image and a classification label of the target sample image
  • the source data set includes a source sample image and a classification label of the source sample image.
  • the detailed steps of obtaining the neural network model can be referred to the aforementioned method 900, which will not be repeated here.
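  • Steps S1210 and S1220 for the image classification case can be sketched as follows. This is only an illustration: `classify` and `toy_model` are hypothetical names, the image is a nested list of pixel values, and `toy_model` is a stand-in for a target neural network model obtained by the search, not a trained network.

```python
def classify(image, model, labels):
    """Run the model on the image and return the label with the top score."""
    scores = model(image)
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]


def toy_model(image):
    """Hypothetical stand-in: scores 'bright' vs 'dark' by mean pixel value."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [mean, 255 - mean]
```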
  • the above description only takes the application of the target neural network model to image classification as an example.
  • the method for obtaining a neural network model provided by the embodiments of the present application can also be applied to other computer vision tasks, for example, scenarios such as target detection and image segmentation.
  • the target neural network model can also be applied to non-visual tasks, for example, scenarios such as natural language processing or speech recognition.
  • the source data set and the target data set can be determined according to the application scenario.
  • the source data set may include the source sample audio signal and the classification label corresponding to the source sample audio signal, and the target data set may include the target sample audio signal and the classification label corresponding to the target sample audio signal.
  • the device in the embodiment of the present application can execute the method of the foregoing embodiment of the present application, that is, for the specific working process of the following various products, reference may be made to the corresponding process in the foregoing method embodiment.
  • FIG. 14 is a schematic block diagram of an apparatus 1300 for obtaining a neural network model provided by an embodiment of the present application. It should be understood that the apparatus 1300 may execute the method of obtaining a neural network model in FIG. 9 or FIG. 12.
  • the apparatus 1300 may be the training device 120 in FIG. 1, or the execution device 310 in FIG. 6, or the system 800 in FIG. 8.
  • the device 1300 includes: an acquiring unit 1310 and a processing unit 1320.
  • the obtaining unit 1310 is used to obtain a pre-trained super network model, which is obtained by training based on a source data set, and to obtain a target data set, where the task corresponding to the target data set is the same as the task corresponding to the source data set;
  • the processing unit 1320 is configured to perform migration learning on the pre-trained super network model based on the target data set to obtain a super network model after migration learning, and to search for a sub-network model in the super network model after migration learning to obtain the target neural network model.
  • the pre-trained super network model is obtained through progressive contraction training.
  • the processing unit 1320 is specifically configured to: select a sub-network model from the pre-trained super network model, calculate the weight gradient of the sub-network model based on the target data set, update the weights of the sub-network model based on the weight gradient of the sub-network model to obtain an updated sub-network model, and obtain an updated super network model based on the updated sub-network model; and repeat the above steps until the updated super network model satisfies a termination condition, to obtain the super network model after migration learning; wherein the termination condition includes at least one of the following: the number of repetitions is greater than or equal to the first iteration number; or the inference accuracy of the updated super network model is greater than or equal to the first inference accuracy.
  • the processing unit 1320 is specifically configured to: select N_b sub-network models from the pre-trained super network model, calculate the weight gradients of the N_b sub-network models based on the target data set, update the weights of the N_b sub-network models based on the weight gradients of the N_b sub-network models to obtain updated N_b sub-network models, and obtain an updated super network model based on the updated N_b sub-network models, where N_b is a positive integer; and repeat the above steps until the updated super network model satisfies a termination condition, to obtain the super network model after migration learning; wherein the termination condition includes at least one of the following: the number of repetitions is greater than or equal to the first iteration number; or the inference accuracy of the updated super network model is greater than or equal to the first inference accuracy.
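  • The weight-sharing migration loop can be illustrated with a toy numerical sketch. All specifics are assumptions: the super network model is a flat list of shared weights, a sub-network model uses the first `width` weights, the "target data set" is represented by a vector `target` whose squared distance to the weights is the loss, the gradients of the N_b sampled sub-network models are averaged into one update of the shared weights, and the only termination condition is the iteration count. Only weights change; the structure (the number of weights) is never modified.

```python
import random


def subnet_gradient(weights, active, target):
    """Gradient of 0.5 * sum((w_i - t_i)^2 for i in active) w.r.t. shared weights."""
    return {i: weights[i] - target[i] for i in active}


def transfer_learn(weights, target, n_b, iterations, lr, rng):
    """Fine-tune shared super-network weights by sampling n_b sub-networks per step."""
    weights = list(weights)                          # keep the pre-trained copy intact
    for _ in range(iterations):                      # first iteration number
        accumulated = [0.0] * len(weights)
        for _ in range(n_b):
            width = rng.randint(1, len(weights))     # sample one sub-network model
            grads = subnet_gradient(weights, range(width), target)
            for i, g in grads.items():
                accumulated[i] += g
        # One update of the shared weights from the averaged gradients.
        weights = [w - lr * g / n_b for w, g in zip(weights, accumulated)]
    return weights


def squared_error(weights, target):
    return sum((w - t) ** 2 for w, t in zip(weights, target))
```

  • Since every sampled sub-network model reads and writes the same shared weights, updating the sub-network models directly yields the updated super network model in this sketch.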
  • the processing unit 1320 is specifically configured to: Step 1: determine n first sub-network models according to the super network model after transfer learning, where n is an integer greater than 1; Step 2: adjust the structure of the n first sub-network models to obtain n second sub-network models; Step 3: select n third sub-network models from the n first sub-network models and the n second sub-network models, and use the n third sub-network models as the n first sub-network models in Step 2; repeat Steps 2 and 3 until the n third sub-network models satisfy a search termination condition, where the search termination condition includes at least one of the following: the number of repetitions is greater than or equal to the second iteration number, or the inference accuracy of at least p third sub-network models among the n third sub-network models is greater than or equal to the target accuracy; and determine the target neural network model according to the n third sub-network models.
  • the processing unit 1320 is specifically configured to: select n fourth sub-network models in the super network model after transfer learning; obtain the hardware overhead of the n fourth sub-network models on the target device; and adjust the structure of the n fourth sub-network models based on the hardware overhead to obtain the n first sub-network models.
  • FIG. 15 is a schematic block diagram of an image processing device 1400 according to an embodiment of the present application. It should be understood that the apparatus 1400 may execute the image processing method of FIG. 13.
  • the apparatus 1400 may be the execution device 110 in FIG. 1 or the local device 301 or the execution device 310 in FIG. 6.
  • the device 1400 includes: an acquisition unit 1410 and a processing unit 1420.
  • the acquisition unit 1410 is used to acquire the image to be processed; the processing unit 1420 is used to perform image processing on the image to be processed using the target neural network model, and output the processing result; wherein the target neural network model is obtained by searching for a sub-network model in a super network model, the super network model is obtained by performing migration learning on a pre-trained super network model based on the target data set, the pre-trained super network model is obtained by training based on the source data set, and the task corresponding to the target data set is the same as the task corresponding to the source data set.
  • the pre-trained super network model is obtained through progressive contraction training.
  • the super network model being obtained by performing migration learning on a pre-trained super network model based on a target data set includes: the super network model is obtained by the following steps: selecting a sub-network model from the pre-trained super network model, calculating the weight gradient of the sub-network model based on the target data set, updating the weights of the sub-network model based on the weight gradient of the sub-network model to obtain an updated sub-network model, and obtaining an updated super network model based on the updated sub-network model; and repeating the above steps until the updated super network model satisfies a termination condition; wherein the termination condition includes at least one of the following: the number of repetitions is greater than or equal to the first iteration number; or the inference accuracy of the updated super network model is greater than or equal to the first inference accuracy.
  • the super network model being obtained by performing migration learning on a pre-trained super network model based on a target data set includes: the super network model is obtained by the following steps: selecting N_b sub-network models from the pre-trained super network model, calculating the weight gradients of the N_b sub-network models based on the target data set, updating the weights of the N_b sub-network models based on the weight gradients of the N_b sub-network models to obtain updated N_b sub-network models, and obtaining an updated super network model based on the updated N_b sub-network models, where N_b is a positive integer; and repeating the above steps until the updated super network model satisfies a termination condition; wherein the termination condition includes at least one of the following: the number of repetitions is greater than or equal to the first iteration number; or the inference accuracy of the updated super network model is greater than or equal to the first inference accuracy.
  • the target neural network model being obtained by searching for a sub-network model in a super network model includes: the target neural network model is obtained by the following steps: determining n first sub-network models according to the super network model, where n is an integer greater than 1; adjusting the structure of the n first sub-network models to obtain n second sub-network models; selecting n third sub-network models from the n first sub-network models and the n second sub-network models, and using the n third sub-network models as the updated n first sub-network models; repeating the above steps until the n third sub-network models satisfy a search termination condition; and determining the target neural network model according to the n third sub-network models; wherein the search termination condition includes at least one of the following: the number of repetitions is greater than or equal to the second iteration number, or the inference accuracy of at least p third sub-network models among the n third sub-network models is greater than or equal to the target accuracy.
  • the determining n first sub-network models according to the super network model includes: selecting n fourth sub-network models in the super network model; obtaining the hardware overhead of the n fourth sub-network models on the target device; and adjusting the structure of the n fourth sub-network models based on the hardware overhead to obtain the n first sub-network models.
  • device 1300 and device 1400 are embodied in the form of functional units.
  • unit herein can be implemented in the form of software and/or hardware, which is not specifically limited.
  • a "unit” may be a software program, a hardware circuit, or a combination of the two that realizes the above-mentioned functions.
  • the hardware circuit may include an application specific integrated circuit (ASIC), an electronic circuit, a processor for executing one or more software or firmware programs (such as a shared processor, a dedicated processor, or a group processor) and memory, a merged logic circuit, and/or other suitable components that support the described functions.
  • the units of the examples described in the embodiments of the present application can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraint conditions of the technical solution. Professionals and technicians can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.
  • FIG. 16 is a schematic diagram of the hardware structure of an apparatus for obtaining a neural network model provided by an embodiment of the present application.
  • the apparatus 3000 for obtaining a neural network model shown in FIG. 16 includes a memory 3001, a processor 3002, a communication interface 3003, and a bus 3004.
  • the memory 3001, the processor 3002, and the communication interface 3003 implement communication connections between each other through the bus 3004.
  • the apparatus 3000 may be the training device 120 in FIG. 1, or the execution device 310 in FIG. 6, or the system 800 in FIG. 8.
  • the memory 3001 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • the memory 3001 may store a program.
  • the processor 3002 is configured to execute each step of the method for obtaining a neural network model in the embodiment of the present application.
  • the processor 3002 may execute steps S920 to S940 in the method shown in FIG. 9 or steps S1120 to S1150 in the method shown in FIG. 12.
  • the processor 3002 may adopt a general-purpose central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits to execute related programs, so as to implement the method for obtaining the neural network model in the method embodiment of the present application.
  • the processor 3002 may also be an integrated circuit chip with signal processing capability. For example, it may be the chip shown in FIG. 5.
  • each step of the method for obtaining a neural network model of the present application can be completed by an integrated logic circuit of hardware in the processor 3002 or instructions in the form of software.
  • the above-mentioned processor 3002 may also be a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory 3001, and the processor 3002 reads the information in the memory 3001 and, in combination with its hardware, completes the functions required by the units included in the apparatus for obtaining a neural network model of the embodiments of the present application, or performs the method for obtaining a neural network model of the method embodiments of the present application.
  • the communication interface 3003 uses a transceiving apparatus, such as but not limited to a transceiver, to implement communication between the device 3000 and other devices or communication networks. For example, a pre-trained super network model or a target data set can be obtained through the communication interface 3003.
  • the bus 3004 may include a path for transferring information between various components of the device 3000 (for example, the memory 3001, the processor 3002, and the communication interface 3003).
  • FIG. 17 is a schematic diagram of the hardware structure of an image processing apparatus according to an embodiment of the present application.
  • the image processing device 4000 shown in FIG. 17 includes a memory 4001, a processor 4002, a communication interface 4003, and a bus 4004.
  • the memory 4001, the processor 4002, and the communication interface 4003 implement communication connections between each other through the bus 4004.
  • the apparatus 4000 may be the execution device 110 in FIG. 1 or the local device 301 or the execution device 310 in FIG. 6.
  • the memory 4001 may be a ROM, a static storage device, or a RAM.
  • the memory 4001 may store a program. When the program stored in the memory 4001 is executed by the processor 4002, the processor 4002 and the communication interface 4003 are used to execute each step of the image processing method of the embodiments of the present application.
  • the processor 4002 may execute step S1210 to step S1220 in the method shown in FIG. 13 above.
  • the processor 4002 may adopt a general-purpose CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits to execute related programs, so as to realize the functions required by the units in the image processing apparatus of the embodiments of the present application, or to perform the image processing method of the method embodiments of the present application.
  • the processor 4002 may also be an integrated circuit chip with signal processing capability. For example, it may be the chip shown in FIG. 5. In the implementation process, each step of the image processing method of the embodiment of the present application can be completed by an integrated logic circuit of hardware in the processor 4002 or instructions in the form of software.
  • the aforementioned processor 4002 may also be a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component.
  • such a processor can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • the general-purpose processor may be a microprocessor, or it may be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a storage medium that is well established in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 4001, and the processor 4002 reads the information in the memory 4001 and, in combination with its hardware, completes the functions required by the units included in the image processing apparatus of the embodiments of the present application, or performs the image processing method of the method embodiments of the present application.
  • the communication interface 4003 uses a transceiving apparatus, such as but not limited to a transceiver, to implement communication between the device 4000 and other devices or a communication network.
  • the image to be processed can be acquired through the communication interface 4003.
  • the bus 4004 may include a path for transferring information between various components of the device 4000 (for example, the memory 4001, the processor 4002, and the communication interface 4003).
  • although the device 3000 and the device 4000 described above show only a memory, a processor, and a communication interface, those skilled in the art should understand that, in a specific implementation, the device 3000 and the device 4000 may also include other components necessary for normal operation. In addition, according to specific needs, those skilled in the art should understand that the device 3000 and the device 4000 may also include hardware components that implement other additional functions. Furthermore, those skilled in the art should understand that the device 3000 and the device 4000 may include only the components necessary to implement the embodiments of the present application, and need not include all the components shown in FIG. 16 and FIG. 17.
  • the processor in the embodiments of the present application may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general-purpose processor may be a microprocessor, or it may be any conventional processor or the like.
  • the memory in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory can be read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or flash memory.
  • the volatile memory may be random access memory (RAM), which is used as an external cache. Many forms of RAM may be used, for example static random access memory (static RAM, SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DR RAM).
  • the above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • the above-mentioned embodiments may be implemented in the form of a computer program product in whole or in part.
  • the computer program product includes one or more computer instructions or computer programs.
  • when the computer instructions or computer programs are loaded or executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless (for example, infrared, radio, or microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center that includes one or more sets of available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium.
  • the semiconductor medium may be a solid state drive.
  • "at least one" refers to one or more, and "multiple" refers to two or more.
  • "at least one of the following items" or similar expressions refers to any combination of these items, including a single item or any combination of a plurality of items.
  • for example, at least one of a, b, or c can mean: a; b; c; a and b; a and c; b and c; or a, b, and c, where each of a, b, and c can be single or multiple.
  • the sequence numbers of the above-mentioned processes do not indicate the order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.
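
The definition of "at least one of a, b, or c" given above can be checked by enumeration. The following is a minimal sketch in Python; the function name `at_least_one_of` is a hypothetical helper introduced only for illustration, and the sketch does not model the remark that each of a, b, and c may itself be single or multiple.

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of the given items, smallest first."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


# Enumerate the readings of "at least one of a, b, or c".
for combo in at_least_one_of(["a", "b", "c"]):
    print(" and ".join(combo))
```

For three items the enumeration yields seven combinations, which matches the seven cases listed in the text.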

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Provided are a method for obtaining a neural network model, and an image processing method and apparatus, in the field of artificial intelligence. The method for obtaining a neural network model includes: obtaining a pre-trained super network model, the pre-trained super network model being obtained by training on a source data set; obtaining a target data set, a task corresponding to the target data set being the same as a task corresponding to the source data set; performing transfer learning on the pre-trained super network model based on the target data set to obtain a super network model after transfer learning; and searching for a sub-network model in the super network model after transfer learning to obtain a target neural network model. With the method of the present application, the training cost of obtaining the required neural network model can be reduced and the performance of the neural network model can be improved.
PCT/CN2021/083371 2020-04-29 2021-03-26 Procédé permettant d'acquérir un modèle de réseau neuronal et procédé et appareil de traitement d'image WO2021218517A1 (fr)
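
The flow summarized in the abstract (start from a pre-trained super network, fine-tune it on the target data set, then search it for a sub-network) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not the method of the application: the "super network" is a list of layers that each hold a few candidate scalar weights, a sub-network picks one candidate per layer and predicts `x * sum(selected weights)`, transfer learning is plain stochastic gradient descent on randomly sampled sub-networks, and the search is exhaustive. The names `predict`, `mse`, `transfer_learn`, and `search` are hypothetical.

```python
import itertools
import random


def predict(weights, choice, x):
    # A sub-network selects one candidate weight per layer (toy model).
    return x * sum(weights[layer][idx] for layer, idx in enumerate(choice))


def mse(weights, choice, data):
    # Mean squared error of one sub-network on (input, target) pairs.
    return sum((predict(weights, choice, x) - t) ** 2 for x, t in data) / len(data)


def transfer_learn(weights, target_data, steps=200, lr=0.01, seed=0):
    # Fine-tune a copy of the pre-trained weights on the target data set by
    # sampling a random sub-network per step and taking one gradient step.
    rng = random.Random(seed)
    tuned = [list(layer) for layer in weights]
    for _ in range(steps):
        choice = [rng.randrange(len(layer)) for layer in tuned]
        x, t = rng.choice(target_data)
        err = predict(tuned, choice, x) - t
        for layer, idx in enumerate(choice):
            tuned[layer][idx] -= lr * 2 * err * x
    return tuned


def search(weights, val_data):
    # Exhaustively search all sub-networks and keep the one with lowest error.
    candidates = itertools.product(*(range(len(layer)) for layer in weights))
    return min(candidates, key=lambda c: mse(weights, c, val_data))


# Hypothetical pre-trained super network and target data set (target: y = 2x).
pretrained = [[0.5, 1.0], [0.2, 0.4, 0.8]]
target_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

tuned = transfer_learn(pretrained, target_data)
best = search(tuned, target_data)
print("selected sub-network:", best, "error:", mse(tuned, best, target_data))
```

The sketch fine-tunes the shared weights once and then reuses them for every candidate sub-network, which is the reason the abstract gives for the reduced training cost; a real system would replace the exhaustive search with a search strategy suited to a much larger space.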

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010357935.0A CN113570029A (zh) 2020-04-29 2020-04-29 获取神经网络模型的方法、图像处理方法及装置
CN202010357935.0 2020-04-29

Publications (1)

Publication Number Publication Date
WO2021218517A1 (fr) 2021-11-04

Family

ID=78158592

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/083371 WO2021218517A1 (fr) 2020-04-29 2021-03-26 Procédé permettant d'acquérir un modèle de réseau neuronal et procédé et appareil de traitement d'image

Country Status (2)

Country Link
CN (1) CN113570029A (fr)
WO (1) WO2021218517A1 (fr)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114298272A (zh) * 2021-12-23 2022-04-08 安谋科技(中国)有限公司 神经网络模型的构建方法、图像处理方法、设备及介质
CN116563450A (zh) * 2022-01-28 2023-08-08 华为技术有限公司 表情迁移方法、模型训练方法和装置
CN115017377B (zh) * 2022-08-05 2022-11-08 深圳比特微电子科技有限公司 用于搜索目标模型的方法、装置和计算设备
CN115238880B (zh) * 2022-09-21 2022-12-23 山东大学 输电巡检终端的自适应部署方法、系统、设备及存储介质
CN117993436A (zh) * 2022-10-31 2024-05-07 华为技术有限公司 一种模型训练方法和相关设备
CN117709409A (zh) * 2023-05-09 2024-03-15 荣耀终端有限公司 应用于图像处理的神经网络训练方法及相关设备

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110428046A (zh) * 2019-08-28 2019-11-08 腾讯科技(深圳)有限公司 神经网络结构的获取方法及装置、存储介质
CN110443352A (zh) * 2019-07-12 2019-11-12 阿里巴巴集团控股有限公司 基于迁移学习的半自动神经网络调优方法
CN110533180A (zh) * 2019-07-15 2019-12-03 北京地平线机器人技术研发有限公司 网络结构搜索方法和装置、可读存储介质、电子设备
US20200065689A1 (en) * 2017-07-21 2020-02-27 Google Llc Neural architecture search for convolutional neural networks

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114021713A (zh) * 2021-11-10 2022-02-08 北京邮电大学 一种基于神经元级别迁移学习的光路传输质量估计方法
CN114494815A (zh) * 2022-01-27 2022-05-13 北京百度网讯科技有限公司 神经网络训练方法、目标检测方法、装置、设备和介质
CN114494815B (zh) * 2022-01-27 2024-04-09 北京百度网讯科技有限公司 神经网络训练方法、目标检测方法、装置、设备和介质
CN114239754A (zh) * 2022-02-24 2022-03-25 中国科学院自动化研究所 基于属性特征学习解耦的行人属性识别方法及系统
CN114239754B (zh) * 2022-02-24 2022-05-03 中国科学院自动化研究所 基于属性特征学习解耦的行人属性识别方法及系统
WO2024031986A1 (fr) * 2022-08-12 2024-02-15 华为云计算技术有限公司 Procédé de gestion de modèle et dispositif associé
WO2024094094A1 (fr) * 2022-11-02 2024-05-10 华为技术有限公司 Procédé et appareil d'entraînement de modèle

Also Published As

Publication number Publication date
CN113570029A (zh) 2021-10-29

Similar Documents

Publication Publication Date Title
WO2021218517A1 (fr) Procédé permettant d'acquérir un modèle de réseau neuronal et procédé et appareil de traitement d'image
WO2020221200A1 (fr) Procédé de construction de réseau neuronal, procédé et dispositifs de traitement d'image
WO2021120719A1 (fr) Procédé de mise à jour de modèle de réseau neuronal, procédé et dispositif de traitement d'image
WO2022083536A1 (fr) Procédé et appareil de construction de réseau neuronal
WO2021043193A1 (fr) Procédé de recherche de structures de réseaux neuronaux et procédé et dispositif de traitement d'images
WO2022042713A1 (fr) Procédé d'entraînement d'apprentissage profond et appareil à utiliser dans un dispositif informatique
WO2020216227A9 (fr) Procédé et appareil de classification d'image et procédé et appareil de traitement de données
WO2021057056A1 (fr) Procédé de recherche d'architecture neuronale, procédé et dispositif de traitement d'image, et support de stockage
WO2021238366A1 (fr) Procédé et appareil de construction de réseau neuronal
WO2022001805A1 (fr) Procédé et dispositif de distillation de réseau neuronal
WO2022052601A1 (fr) Procédé d'apprentissage de modèle de réseau neuronal ainsi que procédé et dispositif de traitement d'image
WO2021022521A1 (fr) Procédé de traitement de données et procédé et dispositif d'apprentissage de modèle de réseau neuronal
WO2021008206A1 (fr) Procédé de recherche d'architecture neuronale, et procédé et dispositif de traitement d'images
WO2021244249A1 (fr) Procédé, système et dispositif d'instruction de classificateur et procédé, système et dispositif de traitement de données
WO2021164750A1 (fr) Procédé et appareil de quantification de couche convolutive
WO2022007867A1 (fr) Procédé et dispositif de construction de réseau neuronal
CN113705769A (zh) 一种神经网络训练方法以及装置
CN111783937A (zh) 一种神经网络构建方法以及系统
WO2023231794A1 (fr) Procédé et appareil de quantification de paramètres de réseau neuronal
WO2021129668A1 (fr) Procédé d'apprentissage de réseau neuronal et dispositif
WO2022012668A1 (fr) Procédé et appareil de traitement d'ensemble d'apprentissage
WO2022156475A1 (fr) Procédé et appareil de formation de modèle de réseau neuronal, et procédé et appareil de traitement de données
WO2021136058A1 (fr) Procédé et dispositif de traitement vidéo
CN114492723A (zh) 神经网络模型的训练方法、图像处理方法及装置
WO2023125628A1 (fr) Procédé et appareil d'optimisation de modèle de réseau neuronal et dispositif informatique

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21797430

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21797430

Country of ref document: EP

Kind code of ref document: A1