CN111310684A - Model training method and device, electronic equipment and storage medium - Google Patents

Model training method and device, electronic equipment and storage medium

Info

Publication number
CN111310684A
Authority
CN
China
Prior art keywords
neural network
training
network model
model
acceleration engine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010114079.6A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongsheng Suzhou Intelligent Technology Co ltd
Original Assignee
Dongsheng Suzhou Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongsheng Suzhou Intelligent Technology Co ltd filed Critical Dongsheng Suzhou Intelligent Technology Co ltd
Priority to CN202010114079.6A priority Critical patent/CN111310684A/en
Publication of CN111310684A publication Critical patent/CN111310684A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a model training method and device, an electronic device, and a storage medium, wherein the method comprises the following steps: optimizing the network structure of a pre-trained first neural network model to obtain an optimized second neural network; and training the second neural network through an inference acceleration engine to obtain a second neural network model. By combining optimization of the model with the use of an inference acceleration engine, the time needed to train the neural network model is reduced and the training speed is increased, effectively solving the problem of slow neural network model training.

Description

Model training method and device, electronic equipment and storage medium
Technical Field
The application relates to the technical fields of artificial intelligence, deep learning, and transfer learning, and in particular to a model training method and device, an electronic device, and a storage medium.
Background
Transfer learning is a machine learning method in which a model built for a first task is used as the initial model from which a model for a second task is constructed. A concrete example of transfer learning: a model trained to recognize cars can also be used to improve truck recognition; because the model has already acquired the ability to recognize cars, only slight adjustment and retraining of its neural network structure are needed for it to recognize trucks.
In current practice, transfer learning is usually used to accelerate neural network model training. However, it has been found that for a neural network model with many neural network layers or many weight parameters to be trained, training remains relatively slow even when transfer learning is used.
Disclosure of Invention
An object of the embodiments of the present application is to provide a model training method, an apparatus, an electronic device, and a storage medium, which are used to solve the problem that the training speed of a neural network model is slow.
The embodiment of the application provides a model training method, which is applied to an electronic device and comprises the following steps: optimizing the network structure of a pre-trained first neural network model to obtain an optimized second neural network; and training the second neural network through an inference acceleration engine to obtain a second neural network model. In this implementation, combining optimization of the model with the use of an inference acceleration engine reduces the time needed to train the neural network model, thereby increasing training speed and effectively solving the problem of slow neural network model training.
Optionally, in an embodiment of the present application, the network structure includes: a weight parameter and a neural network layer; optimizing the network structure of the pre-trained first neural network model comprises: adjusting a weight parameter in the network structure of the first neural network model; or adjusting the number of neural network layers in the network structure of the first neural network model. In this implementation, adjusting the weight parameters or the number of neural network layers effectively optimizes the network structure, reducing the number of parameters or layers to be trained and thereby increasing training speed.
Optionally, in this embodiment of the present application, training the second neural network through the inference acceleration engine to obtain a second neural network model comprises: obtaining a plurality of training data and a plurality of training labels, wherein the training labels are the data labels corresponding to the training data; and training the second neural network with the plurality of training data and the plurality of training labels through the inference acceleration engine to obtain the second neural network model. In this implementation, the inference acceleration engine increases the computing capacity available when training the second neural network, thereby increasing training speed.
Optionally, in an embodiment of the present application, training the second neural network with the plurality of training data and the plurality of training labels through the inference acceleration engine comprises: installing a driver of the electronic device and the dependency environment of the inference acceleration engine; and training the second neural network with the plurality of training data and the plurality of training labels through the inference acceleration engine to obtain the second neural network model. In this implementation, installing the driver and the inference acceleration engine's dependency environment increases the computing capacity available during training, thereby increasing training speed.
Optionally, in this embodiment of the present application, before optimizing the network structure of the pre-trained first neural network model to obtain an optimized second neural network, the method further includes: receiving the first neural network model sent by a terminal device; and after training the second neural network through the inference acceleration engine to obtain the second neural network model, the method further includes: sending the second neural network model to the terminal device. In this implementation, the electronic device receives the first neural network model from the terminal device and, after obtaining the second neural network model, sends it back to the terminal device; this increases the speed at which the terminal device obtains the trained second neural network model.
The embodiment of the present application further provides a model training method, applied to a terminal device, including: obtaining a first neural network model; sending the first neural network model to an electronic device, so that the electronic device optimizes the network structure of the first neural network model to obtain an optimized second neural network, trains the second neural network through an inference acceleration engine, and obtains and sends a second neural network model; and receiving the second neural network model sent by the electronic device. In this implementation, the terminal device delegates the optimization and accelerated training to the electronic device, which increases the speed at which the terminal device obtains the trained second neural network model.
The embodiment of the present application further provides a model training device, which is applied to an electronic device and includes: a model optimization module for optimizing the network structure of the pre-trained first neural network model to obtain an optimized second neural network; and a model training module for training the second neural network through the inference acceleration engine to obtain a second neural network model. In this implementation, combining optimization of the model with the use of an inference acceleration engine reduces the time needed to train the neural network model, thereby increasing training speed and effectively solving the problem of slow neural network model training.
Optionally, in an embodiment of the present application, the network structure includes: a weight parameter and a neural network layer; a model optimization module comprising: the structure adjusting module is used for adjusting weight parameters in a network structure of the first neural network model; or adjusting the number of neural network layers in the network structure of the first neural network model.
Optionally, in an embodiment of the present application, the model training module includes: a first obtaining module for obtaining a plurality of training data and a plurality of training labels, wherein the training labels are the data labels corresponding to the training data; and a second obtaining module for obtaining a second neural network model by training the second neural network with the plurality of training data and the plurality of training labels through the inference acceleration engine.
Optionally, in an embodiment of the present application, the second obtaining module includes: an environment installation module for installing a driver of the electronic device and the dependency environment of the inference acceleration engine; and a third obtaining module for obtaining the second neural network model by training the second neural network with the plurality of training data and the plurality of training labels through the inference acceleration engine.
Optionally, in an embodiment of the present application, the model training apparatus further includes: a model receiving module for receiving the first neural network model sent by the terminal device; and a model sending module for sending the second neural network model to the terminal device.
The embodiment of the present application further provides a model training device, which is applied to a terminal device, and includes: a network model obtaining module for obtaining a first neural network model; the network model sending module is used for sending the first neural network model to the electronic equipment so that the electronic equipment optimizes the network structure of the first neural network model to obtain an optimized second neural network, and trains the second neural network through the inference acceleration engine to obtain and send the second neural network model; and the network model receiving module is used for receiving the second neural network model sent by the electronic equipment.
An embodiment of the present application further provides an electronic device, including: a processor and a memory, the memory storing processor-executable machine-readable instructions, the machine-readable instructions when executed by the processor performing the method as described above.
Embodiments of the present application also provide a storage medium having a computer program stored thereon, where the computer program is executed by a processor to perform the method as described above.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting the scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
FIG. 1 is a schematic flow chart diagram illustrating a model training method provided in an embodiment of the present application;
fig. 2 is a schematic flowchart illustrating interaction between an electronic device and a terminal device according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a model training apparatus provided in an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The technical solution in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Before introducing the model training method provided by the embodiments of the present application, some concepts involved in the embodiments are introduced as follows:
neural Networks (NN) are complex network systems formed by widely interconnecting a large number of simple processing units (called neurons), reflect many basic features of human brain functions, and are highly complex nonlinear dynamical learning systems; the neural network has the capabilities of large-scale parallel, distributed storage and processing, self-organization, self-adaptation and self-learning, and is particularly suitable for processing inaccurate and fuzzy information processing problems which need to consider many factors and conditions simultaneously.
Convolutional Neural Networks (CNNs) are artificial neural networks whose artificial neurons respond to surrounding units within their receptive field, making them well suited to large-scale image processing. A convolutional neural network includes convolutional layers and pooling layers, and comes in one-dimensional, two-dimensional, and three-dimensional variants: one-dimensional convolutional neural networks are often applied to sequence data; two-dimensional convolutional neural networks are often applied to image and text recognition; three-dimensional convolutional neural networks are mainly applied to medical image and video data recognition.
A Convolutional Layer, also called a convolutional neural network layer, is a computation-unit layer composed of several convolutional units whose parameters are optimized through the back-propagation algorithm. A convolutional layer produces a group of parallel feature maps, formed by sliding different convolution kernels over the input image and performing the corresponding operations.
The Pooling Layer performs partitioned sampling on data, down-sampling a large matrix into a smaller one; this reduces the amount of computation and can prevent over-fitting. The pooling layer performs a pooling operation on the feature map and passes the result to a rectified linear unit for further computation. Pooling is another important concept in convolutional neural networks and is in fact a form of down-sampling; many different forms of nonlinear pooling functions exist.
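As a small illustration of the down-sampling just described (the tensor values are arbitrary), max pooling with a 2x2 kernel halves each spatial dimension of a feature map; a minimal PyTorch sketch:

```python
import torch
import torch.nn as nn

# Down-sample a 4x4 feature map into a 2x2 one, reducing computation.
x = torch.arange(16.0).reshape(1, 1, 4, 4)  # batch, channels, height, width
pooled = nn.MaxPool2d(kernel_size=2)(x)     # keeps the max of each 2x2 block
print(pooled.shape)                          # torch.Size([1, 1, 2, 2])
```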
The full Connected Layer (FC) is a linear operation unit Layer that integrates features in an image feature map passing through a plurality of convolution and pooling layers. The fully-connected layer maps the feature map generated by the convolutional layer into a fixed-length (typically the number of image classes in the input image dataset) feature vector.
A normalized exponential function (Softmax), also known as a Softmax classifier, Softmax layer, or Softmax function, is in fact the gradient-log-normalization of a finite discrete probability distribution. In mathematics, in particular in probability theory and related fields, the normalized exponential function is a generalization of the logistic function; it "compresses" a K-dimensional vector z of arbitrary real numbers into another K-dimensional real vector σ(z) such that each element lies in the range (0,1) and all elements sum to 1.
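For reference, the standard form of the function described above is:

```latex
\sigma(\mathbf{z})_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, \qquad j = 1, \dots, K
```

Each output σ(z)_j is positive and the outputs sum to 1, so the vector can be read as a probability distribution over the K classes.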
A server refers to a device that provides computing services over a network, for example x86 servers and non-x86 servers; non-x86 servers include mainframes, minicomputers, and UNIX servers. In a specific implementation, the server may be a minicomputer or a mainframe: a minicomputer is a closed, dedicated device that mainly provides UNIX operating system computing services and uses processors with Reduced Instruction Set Computing (RISC) architectures, whose performance is often quoted in millions of instructions per second (MIPS); a mainframe, also known as a mainframe computer, refers to a device that provides computing services using a dedicated processor instruction set, operating system, and application software.
The processor is an integrated circuit chip with signal-processing capability. Classes of processors include, but are not limited to: a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a Video Processing Unit (VPU), and the like. Depending on manufacturer and model, the processor may be, for example, an NVIDIA discrete graphics card, an Intel integrated graphics card, or an Intel video processor; the GPU may be a discrete graphics card or integrated graphics.
It should be noted that the model training method provided in the embodiments of the present application may be executed by an electronic device, where the electronic device refers to a device terminal capable of executing a computer program, or the server described above. The device terminal includes, for example: Personal Computers (PCs), tablet computers, PDAs, Mobile Internet Devices (MIDs), devices with intelligent computing chips, or Industrial Personal Computers (IPCs); an industrial personal computer, also called an industrial control computer, is a bus-structured tool device that detects and controls production processes, electromechanical equipment, and process equipment.
Before introducing the model training method provided in the embodiment of the present application, an application scenario applicable to the model training method is introduced, where the application scenario includes, but is not limited to: optimizing the process of model training by using the model training method, or improving the speed of model training by using the model training method, and the like.
Please refer to fig. 1, which is a schematic flow chart of a model training method provided in the embodiment of the present application; the model training method can be applied to the electronic equipment, and the method can comprise the following steps:
step S110: the electronic equipment optimizes the network structure of the pre-trained first neural network model to obtain an optimized second neural network.
The neural network model refers to a model obtained by training an untrained neural network with preset training data, where the preset training data may be set according to the specific actual situation; for example, in an image recognition task the preset training data are the images to be recognized, and in supervised learning a correct label must be set for each training datum. A typical artificial neural network model generally includes: a network Architecture, Activation Rules, and Learning Rules.
The first neural network model refers to a pre-trained neural network model, where the first neural network model may be a convolutional neural network model, and in a specific practical process, the convolutional neural network model is commonly used, for example: DenseNet, LeNet, AlexNet, VGG, GoogLeNet, and ResNet, among others.
The network structure refers to the variables in the neural network model and their topological relations; for example, the variables in a neural network may be the weights of neuron connections and the activation values of the neurons. It can also be understood as the specific constituents or components of the neural network model; for example, the network structure of the first neural network model includes: a weight parameter and a neural network layer. Most neural network models are built from Artificial Neural Networks (ANNs), which are mathematical or computational models that imitate the structure and function of biological neural networks (such as an animal's central nervous system, for example the brain) and are used in machine learning and cognitive science to estimate or approximate functions; the neural network here computes over a large number of artificial neuron connections.
The above-mentioned embodiment of optimizing the network structure of the pre-trained first neural network model in step S110 includes:
step S111: the weight parameters in the network structure of the first neural network model are adjusted.
The weight parameter is used for representing the connection strength between the neurons in the neural network model; the weight parameters may be changed during the process of training the neural network model by using the training data, and of course, the weight parameters of the neural network may also be initialized when the neural network is constructed, for example: randomly initializing the weight parameters of the neural network, namely randomly initializing the neural network.
An embodiment of adjusting the weight parameters in the network structure of the first neural network model is as follows: transfer learning is added to the forward- and back-propagation process of neural network training, and the hyper-parameters are adjusted in real time in combination with fine-tuning to achieve better performance. Fine-tuning here adjusts the number of classes of the softmax classifier; for example, if the original neural network classifies 2 kinds of images and 1 new class is to be added so that the network classifies 3 kinds, most of the previously trained parameters can be retained to achieve fast training convergence: each convolutional layer is kept, and only the fully connected layer and the softmax layer after the convolutional layers are rebuilt. By replacing the features of a specific class through fine-tuning, efficient transfer learning can be realized without increasing the amount of data; when migrated to a new training task, the model is less likely to over-fit the new data, so its overall generalization ability and robustness are improved.
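A minimal sketch of this fine-tuning procedure, assuming a PyTorch ResNet-18 backbone and the 2-to-3 class example above (the framework and architecture are illustrative assumptions, not prescribed by the application):

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a pre-trained first neural network model (requires torchvision >= 0.13).
model = models.resnet18(weights="IMAGENET1K_V1")

# Keep each convolutional layer: freeze all pre-trained parameters.
for param in model.parameters():
    param.requires_grad = False

# Rebuild only the fully connected layer so the softmax classifier now
# covers 3 classes instead of the original 2.
model.fc = nn.Linear(model.fc.in_features, 3)

# Only the rebuilt head is trained, which gives fast training convergence.
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()  # applies log-softmax internally
```

Because only the small rebuilt head receives gradient updates, far fewer weight parameters are trained, which is exactly the speed-up this adjustment aims at.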
Or the above-mentioned embodiment of optimizing the network structure of the pre-trained first neural network model in step S110 includes:
step S112: the number of neural network layers in the network structure of the first neural network model is adjusted.
The embodiment of step S112 described above is, for example: suppose the network structure of the first neural network model comprises four neural network layers: a first convolutional layer, a second convolutional layer, a pooling layer, and a fully connected layer. Adjusting the number of neural network layers can then take two forms: in the first case a neural network layer is added, e.g. if a third convolutional layer is added, the adjusted first neural network model includes the first convolutional layer, the second convolutional layer, the third convolutional layer, the pooling layer, and the fully connected layer; in the second case a neural network layer is removed, e.g. if the second convolutional layer is removed, the adjusted first neural network model includes the first convolutional layer, the pooling layer, and the fully connected layer.
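Both cases can be sketched in a few lines of PyTorch (the channel sizes are illustrative assumptions):

```python
import torch.nn as nn

# The assumed four-layer network structure from the example above.
base = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),   # first convolutional layer
    nn.Conv2d(16, 32, 3, padding=1),  # second convolutional layer
    nn.MaxPool2d(2),                  # pooling layer
    nn.Flatten(),
    nn.LazyLinear(10),                # fully connected layer (input size inferred)
)

# Case 1: add a third convolutional layer after the second one.
layers = list(base.children())
layers.insert(2, nn.Conv2d(32, 32, 3, padding=1))
deeper = nn.Sequential(*layers)

# Case 2: remove the second convolutional layer; the lazy linear layer
# infers its input size on first use, so the shapes still match.
layers = list(base.children())
del layers[1]
shallower = nn.Sequential(*layers)
```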
In the implementation process, the weight parameters in the network structure of the first neural network model are adjusted; or adjusting the number of neural network layers in the network structure of the first neural network model; the network structure of the neural network model is effectively optimized, so that the training parameters or the number of training layers of the neural network model are reduced, and the training speed of the neural network model is increased.
After step S110, step S120 is performed: the electronic device trains a second neural network through the inference acceleration engine to obtain a second neural network model.
An inference acceleration engine refers to a computing service used in conjunction with a hardware device and provided for neural networks; examples include: TensorRT based on NVIDIA discrete graphics cards, OpenVINO based on Intel integrated graphics, and the High Density Deep Learning (HDDL) framework based on multiple Intel video processors.
The above embodiment of training the second neural network by the inference acceleration engine in step S120 may include the following steps:
step S121: a plurality of training data and a plurality of training labels are obtained, wherein the training labels are data labels corresponding to the training data.
An embodiment of obtaining a plurality of training data and a plurality of training labels is, for example, as follows. For ease of understanding and explanation, suppose the training data and training labels are packaged together as a single compressed archive; this archive may be obtained in several ways: in the first way, a pre-stored archive is obtained from a file system or from a database; in the second way, the archive is received from another terminal device; in the third way, the archive is obtained from the Internet using software such as a browser, or by another application accessing the Internet.
Step S122: a second neural network model is obtained by the inference acceleration engine training a second neural network using the plurality of training data and the plurality of training labels.
The embodiment of step S122 described above is, for example: first, the type of the processor of the electronic device is identified; then, according to the processor type, the processor driver and the dependency environment of the inference acceleration engine are installed; finally, the second neural network is trained with the plurality of training data and the plurality of training labels through the inference acceleration engine to obtain the second neural network model. The inference acceleration engine here includes: TensorRT, OpenVINO, HDDL, and the like. In this implementation, training through the inference acceleration engine increases the computing capacity available to the second neural network model during training, thereby increasing training speed.
An embodiment of installing the processor driver and the dependency environment of the inference acceleration engine according to the processor type is, for example: if the processor of the electronic device is an NVIDIA discrete graphics card, the NVIDIA driver and the dependency environment of TensorRT can be installed; if the processor is an Intel integrated graphics card, the Intel graphics driver and the dependency environment of OpenVINO can be installed. A model generated with OpenVINO can be loaded onto a variety of heterogeneous computing chips, i.e. it can run on a CPU, GPU, and/or VPU, which accelerates data computation, greatly reduces the time consumed by training and detection, and lets the software switch freely between different hardware and computing chips for cross-hardware-platform computation. If the processor of the electronic device is an Intel video processor, the Intel video processor driver and the dependency environment of HDDL can be installed.
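A hypothetical helper (the function name and processor identifiers are assumptions, not from the application) that mirrors this processor-to-engine mapping:

```python
# Map the identified processor type to its inference acceleration engine,
# following the driver/dependency pairing described above.
def select_engine(processor: str) -> str:
    mapping = {
        "nvidia_discrete_gpu": "TensorRT",   # NVIDIA discrete graphics card
        "intel_integrated_gpu": "OpenVINO",  # Intel integrated graphics card
        "intel_vpu": "HDDL",                 # Intel video processors
    }
    try:
        return mapping[processor]
    except KeyError:
        raise ValueError(f"no known inference acceleration engine for {processor!r}")

print(select_engine("intel_integrated_gpu"))  # -> OpenVINO
```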
It will be appreciated that, in specific practice, the inference acceleration engine may be employed to process a variety of digital signal data, including: video data, image data, sound signals, or audio data. Inference acceleration engines here include, but are not limited to: TensorRT on NVIDIA discrete graphics cards, OpenVINO on Intel integrated graphics, and the HDDL framework on multiple Intel video processors.
After the optimized second neural network model is deployed to the electronic device, parallel heterogeneous acceleration can be used during training: multiple hardware acceleration chips such as a CPU, a VPU, and a Field-Programmable Gate Array (FPGA) process the workload simultaneously to some extent, maximizing the utilization of each computing chip and improving the efficiency of model training. That is, the neural network model is trained on processors of various hardware platforms, so the model training method can be deployed on a wider variety of hardware products; for example, the second neural network model described above may be run on an FPGA such as an Arria 10. This approach also improves the utilization of each computing chip, so resources are allocated reasonably, the data processing of the neural network is distributed evenly across different chips, and overall performance improves. In this implementation, the driver of the electronic device and the dependency environment of the inference acceleration engine are installed, and the second neural network is trained with the plurality of training data and the plurality of training labels through the inference acceleration engine to obtain the second neural network model; this increases the computing capacity available during training and thereby increases training speed.
Of course, in a specific model training process, a regularization term may be added to the neural network model to prevent over-fitting of the training result; regularization here means that, in linear algebra theory, an ill-posed problem is usually defined by a set of linear algebraic equations. Meanwhile, a Dropout layer can be used in the hidden layer to improve the fault tolerance of the model. Dropout is a regularization technique used in artificial neural networks to combat over-fitting; it avoids complex co-adaptation on the training data. For example, during forward and backward propagation a large number of neurons carry no substantive significance; to simplify the topological structure of the model and greatly reduce the amount of neuron computation, a certain proportion of hidden neurons can be temporarily discarded at random in each training iteration. By adding a regularization term to the neural network model or using a Dropout layer in the hidden layer, the network no longer depends entirely on the weights of individual neurons, and over-fitting can be effectively prevented.
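A short sketch of both countermeasures (layer sizes and hyper-parameter values are illustrative assumptions):

```python
import torch
import torch.nn as nn

# A hidden layer followed by Dropout: in each training iteration, half of
# the hidden neurons are temporarily discarded at random.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(64, 3),
)

# weight_decay adds an L2 regularization term to the training objective,
# so the network does not rely entirely on any individual neuron weight.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
```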
In a specific practical process, the optimized second neural network model can be converted into a file in the ONNX format, and the inference acceleration engine then loads and trains the second neural network model in the ONNX format. For example, the generated model is compressed and optimized by a runtime engine, or OpenVINO performs Intermediate Representation (IR) deep learning inference on the ONNX-format second neural network model, i.e. the ONNX model is fed into OpenVINO's deep learning Inference Engine for IR conversion to optimize and compress the model; this IR step can be understood as slimming the ONNX-format second neural network model, for example by reducing the weight parameters or the number of neural network layers as described above. ONNX (Open Neural Network Exchange) is an open file format designed for machine learning and used to store trained models; it enables different artificial intelligence frameworks (such as PyTorch and MXNet) to store and exchange model data in the same format. Deep learning frameworks that currently support loading and reasoning over ONNX models include: Caffe2, PyTorch, MXNet, ML.NET, TensorRT, and Microsoft CNTK, among others.
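A minimal sketch of this conversion (the model, input shape, and file name are assumptions), exporting to ONNX with PyTorch and loading the file with ONNX Runtime, one of the ONNX-capable runtimes:

```python
import torch
import torch.nn as nn

# A stand-in for the optimized second neural network (assumed architecture).
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Flatten(), nn.LazyLinear(3))
model.eval()

dummy = torch.randn(1, 3, 224, 224)  # assumed image-style input
model(dummy)  # materialize the lazy layer before export
torch.onnx.export(model, dummy, "second_model.onnx", opset_version=11)

# Load and run the exported file with an ONNX-compatible inference engine.
import onnxruntime as ort
sess = ort.InferenceSession("second_model.onnx")
out = sess.run(None, {sess.get_inputs()[0].name: dummy.numpy()})
```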
In this implementation, the network structure of the pre-trained first neural network model is optimized to obtain an optimized second neural network, and the second neural network is trained through the inference acceleration engine to obtain the second neural network model; that is, combining optimization of the model with the use of the inference acceleration engine reduces the time needed to train the neural network model, thereby increasing training speed and effectively solving the problem of slow neural network model training.
Please refer to a schematic flow chart of interaction between an electronic device and a terminal device provided in an embodiment of the present application shown in fig. 2; optionally, in this embodiment of the present application, the electronic device executing the model training method may further interact with a terminal device, and then the model training method may further include the following steps:
step S210: the terminal device obtains a first neural network model.
The embodiment of the terminal device obtaining the first neural network model includes: in the first mode, a first neural network model stored in advance is obtained, the first neural network model is obtained from a file system, or the first neural network model is obtained from a database; in a second mode, a first neural network model is received and obtained from other terminal equipment; in the third mode, a first neural network model on the internet is acquired by using software such as a browser, or the first neural network model is acquired by accessing the internet by using other application programs.
Step S220: the terminal device sends the first neural network model to the electronic device.
An embodiment in which the terminal device sends the first neural network model to the electronic device is, for example: the terminal device sends the first neural network model through the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP). TCP is a connection-oriented, reliable, byte-stream-based transport layer communication protocol; in the Internet protocol suite, the TCP layer is an intermediate layer above the IP layer and below the application layer. Reliable, pipe-like connections are often required between the application layers of different hosts, but the IP layer does not provide such a stream mechanism; it provides only unreliable packet switching. UDP, short for User Datagram Protocol, is a connectionless transport layer protocol in the Open Systems Interconnection (OSI) reference model that provides a simple, unreliable, transaction-oriented message delivery service.
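A hedged sketch of the TCP variant (the host, port, and file name are assumptions; the application fixes no concrete wire format):

```python
import socket

HOST, PORT = "192.0.2.10", 9000  # assumed address of the electronic device

# Send the serialized first neural network model as a reliable byte stream,
# which is exactly the service TCP provides as described above.
with open("first_model.onnx", "rb") as f, socket.create_connection((HOST, PORT)) as s:
    s.sendall(f.read())
    s.shutdown(socket.SHUT_WR)  # signal the end of the model payload
```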
Step S230: the electronic equipment receives the first neural network model sent by the terminal equipment, optimizes the network structure of the first neural network model trained in advance, and obtains an optimized second neural network.
The embodiment of the first neural network model sent by the receiving terminal device of the electronic device is as follows: the electronic device receives the first neural network model transmitted by the terminal device through a TCP protocol or a UDP protocol, and it can be understood that specific receiving and transmitting protocols can be set according to specific situations.
The implementation principle and implementation manner of this step are similar or similar to those of step S110, and therefore, the implementation manner and implementation principle of this step are not described here, and if it is not clear, reference may be made to the description of step S110.
Step S240: the electronic device trains a second neural network through the inference acceleration engine to obtain a second neural network model.
The implementation principle and implementation manner of this step are similar or similar to those of step S120, and therefore, the implementation manner and implementation principle of this step are not described here, and if it is not clear, reference may be made to the description of step S120.
Step S250: the electronic device sends the second neural network model to the terminal device.
The implementation principle and implementation manner of this step are similar or analogous to those of step S220, and therefore, the implementation manner and implementation principle of this step are not described here, and if it is not clear, reference may be made to the description of step S220.
Step S260: and the terminal equipment receives the second neural network model transmitted by the electronic equipment.
An embodiment in which the terminal device receives the second neural network model sent by the electronic device is, for example: the terminal device receives the second neural network model through the HyperText Transfer Protocol (HTTP); HTTP is a simple request-response protocol that typically runs on top of TCP and specifies what messages the client may send to the server and what responses it receives.
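A sketch of the HTTP retrieval (the URL and file name are assumptions), using the requests library:

```python
import requests

# Fetch the trained second neural network model from the electronic device.
resp = requests.get("http://192.0.2.10:8080/models/second_model.onnx", timeout=30)
resp.raise_for_status()  # HTTP is request-response: check the response status

with open("second_model.onnx", "wb") as f:
    f.write(resp.content)
```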
In the implementation process, the first neural network model is sent to the electronic equipment through the terminal equipment, so that the electronic equipment optimizes the network structure of the first neural network model to obtain an optimized second neural network, and the second neural network is trained through the inference acceleration engine to obtain a second neural network model; after obtaining the second neural network model, the electronic device also sends the second neural network model to the terminal device; therefore, the speed of the terminal equipment for obtaining the trained second neural network model is improved.
Please refer to fig. 3, which is a schematic diagram of a model training apparatus provided in the embodiment of the present application; the embodiment of the present application provides a model training apparatus 300, which is applied to an electronic device, and includes:
and a model optimization module 310, configured to optimize a network structure of the pre-trained first neural network model to obtain an optimized second neural network.
And the model training module 320 is used for training the second neural network through the inference acceleration engine to obtain a second neural network model.
Optionally, in an embodiment of the present application, the network structure includes: a weight parameter and a neural network layer; a model optimization module comprising:
the structure adjusting module is used for adjusting weight parameters in a network structure of the first neural network model; or adjusting the number of neural network layers in the network structure of the first neural network model.
Optionally, in an embodiment of the present application, the model training module includes:
the first obtaining module is used for obtaining a plurality of training data and a plurality of training labels, wherein the training labels are data labels corresponding to the training data.
A second obtaining module for obtaining a second neural network model by the inference acceleration engine training a second neural network using the plurality of training data and the plurality of training labels.
Optionally, in an embodiment of the present application, the second obtaining module includes:
and the environment installation module is used for installing a driver of the electronic equipment and deducing the dependent environment of the acceleration engine.
A third obtaining module for obtaining a second neural network model by training the second neural network with the plurality of training data and the plurality of training labels by the inference acceleration engine.
Optionally, in an embodiment of the present application, the model training apparatus further includes:
and the model sending module is used for receiving the first neural network model sent by the terminal equipment.
And the model receiving module is used for sending the second neural network model to the terminal equipment.
The embodiment of the present application further provides a model training device, which is applied to a terminal device, and the model training device includes:
and the network model obtaining module is used for obtaining the first neural network model.
And the network model sending module is used for sending the first neural network model to the electronic equipment so that the electronic equipment optimizes the network structure of the first neural network model to obtain an optimized second neural network, trains the second neural network through the inference acceleration engine, and obtains and sends the second neural network model.
And the network model receiving module is used for receiving the second neural network model sent by the electronic equipment.
It should be understood that this apparatus corresponds to the model training method embodiments above and can perform the steps involved in those embodiments; the specific functions of the apparatus can be found in the description above, and detailed description is omitted here to avoid redundancy. The apparatus includes at least one software functional module that can be stored in memory in the form of software or firmware, or solidified in the Operating System (OS) of the device.
Please refer to fig. 4 for a schematic structural diagram of an electronic device according to an embodiment of the present application. An electronic device 400 provided in an embodiment of the present application includes: a processor 410 and a memory 420, the memory 420 storing machine-readable instructions executable by the processor 410, the machine-readable instructions when executed by the processor 410 performing the method as above.
The embodiment of the present application further provides a storage medium 430, where the storage medium 430 stores a computer program, and the computer program is executed by the processor 410 to perform the above model training method.
The storage medium 430 may be implemented by any type of volatile or nonvolatile storage device or combination thereof, such as a Static Random Access Memory (SRAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), an Erasable Programmable Read-Only Memory (EPROM), a Programmable Read-Only Memory (PROM), a Read-Only Memory (ROM), a magnetic Memory, a flash Memory, a magnetic disk, or an optical disk.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The above description is only an alternative embodiment of the embodiments of the present application, but the scope of the embodiments of the present application is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the embodiments of the present application, and all the changes or substitutions should be covered by the scope of the embodiments of the present application.

Claims (10)

1. A model training method is applied to electronic equipment and comprises the following steps:
optimizing the network structure of a pre-trained first neural network model to obtain an optimized second neural network;
and training the second neural network through an inference acceleration engine to obtain a second neural network model.
2. The method of claim 1, wherein the network structure comprises: a weight parameter and a neural network layer; the optimizing the network structure of the pre-trained first neural network model includes:
adjusting a weight parameter in a network structure of the first neural network model; or
Adjusting a number of neural network layers in a network structure of the first neural network model.
3. The method of claim 1, wherein training the second neural network by the inference acceleration engine to obtain a second neural network model comprises:
obtaining a plurality of training data and a plurality of training labels, wherein the training labels are data labels corresponding to the training data;
a second neural network model is obtained by the inference acceleration engine training the second neural network using the plurality of training data and the plurality of training labels.
4. The method of claim 3, wherein the training the second neural network using the plurality of training data and the plurality of training labels by the inference acceleration engine comprises:
installing a driver of the electronic device and a dependent environment of the inference acceleration engine;
training, by the inference acceleration engine, the second neural network using the plurality of training data and the plurality of training labels, obtaining a second neural network model.
5. The method of claim 1, further comprising, before the optimizing the network structure of the pre-trained first neural network model to obtain an optimized second neural network:
receiving a first neural network model sent by terminal equipment;
after the training the second neural network by the inference acceleration engine to obtain a second neural network model, further comprising:
and sending a second neural network model to the terminal equipment.
6. A model training method is applied to terminal equipment and comprises the following steps:
obtaining a first neural network model;
sending the first neural network model to electronic equipment so that the electronic equipment optimizes the network structure of the first neural network model to obtain an optimized second neural network, training the second neural network through an inference acceleration engine, and obtaining and sending a second neural network model;
receiving the second neural network model transmitted by the electronic device.
7. A model training device applied to an electronic device includes:
the model optimization module is used for optimizing the network structure of the pre-trained first neural network model to obtain an optimized second neural network;
and the model training module is used for training the second neural network through the inference acceleration engine to obtain a second neural network model.
8. A model training device is characterized by being applied to a terminal device and comprising:
a network model obtaining module for obtaining a first neural network model;
the network model sending module is used for sending the first neural network model to the electronic equipment so that the electronic equipment optimizes the network structure of the first neural network model to obtain an optimized second neural network, and training the second neural network through an inference acceleration engine to obtain and send a second neural network model;
and the network model receiving module is used for receiving the second neural network model sent by the electronic equipment.
9. An electronic device, comprising: a processor and a memory, the memory storing machine-readable instructions executable by the processor, the machine-readable instructions, when executed by the processor, performing the method of any of claims 1 to 5.
10. A storage medium, characterized in that the storage medium has stored thereon a computer program which, when executed by a processor, performs the method according to any one of claims 1 to 6.
CN202010114079.6A 2020-02-24 2020-02-24 Model training method and device, electronic equipment and storage medium Pending CN111310684A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010114079.6A CN111310684A (en) 2020-02-24 2020-02-24 Model training method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010114079.6A CN111310684A (en) 2020-02-24 2020-02-24 Model training method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111310684A true CN111310684A (en) 2020-06-19

Family

ID=71161900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010114079.6A Pending CN111310684A (en) 2020-02-24 2020-02-24 Model training method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111310684A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738436A (en) * 2020-06-28 2020-10-02 电子科技大学中山学院 Model distillation method and device, electronic equipment and storage medium
CN111738436B (en) * 2020-06-28 2023-07-18 电子科技大学中山学院 Model distillation method and device, electronic equipment and storage medium
CN111950630A (en) * 2020-08-12 2020-11-17 深圳市烨嘉为技术有限公司 Small sample industrial product defect classification method based on two-stage transfer learning
CN111950630B (en) * 2020-08-12 2022-08-02 深圳市烨嘉为技术有限公司 Small sample industrial product defect classification method based on two-stage transfer learning
CN112149551A (en) * 2020-09-21 2020-12-29 上海孚聪信息科技有限公司 Safety helmet identification method based on embedded equipment and deep learning
CN111931926A (en) * 2020-10-12 2020-11-13 南京风兴科技有限公司 Hardware acceleration system and control method for convolutional neural network CNN
CN112329997A (en) * 2020-10-26 2021-02-05 国网河北省电力有限公司雄安新区供电公司 Power demand load prediction method and system, electronic device, and storage medium
CN112735083A (en) * 2021-01-19 2021-04-30 齐鲁工业大学 Embedded gateway for flame detection by using YOLOv5 and OpenVINO and deployment method thereof
CN113128670A (en) * 2021-04-09 2021-07-16 南京大学 Neural network model optimization method and device
CN113128670B (en) * 2021-04-09 2024-03-19 南京大学 Neural network model optimization method and device
WO2023040740A1 (en) * 2021-09-18 2023-03-23 华为技术有限公司 Method for optimizing neural network model, and related device

Similar Documents

Publication Publication Date Title
CN111310684A (en) Model training method and device, electronic equipment and storage medium
US11790212B2 (en) Quantization-aware neural architecture search
US20220108157A1 (en) Hardware architecture for introducing activation sparsity in neural network
CN111191791A (en) Application method, training method, device, equipment and medium of machine learning model
CN110309847B (en) Model compression method and device
US11604960B2 (en) Differential bit width neural architecture search
WO2022179492A1 (en) Pruning processing method for convolutional neural network, data processing method and devices
CN111523640B (en) Training method and device for neural network model
US20210117786A1 (en) Neural networks for scalable continual learning in domains with sequentially learned tasks
WO2014060001A1 (en) Multitransmitter model of the neural network with an internal feedback
CN111222046B (en) Service configuration method, client for service configuration, equipment and electronic equipment
US20210383205A1 (en) Taxonomy Construction via Graph-Based Cross-domain Knowledge Transfer
US20200302283A1 (en) Mixed precision training of an artificial neural network
CN112131578A (en) Method and device for training attack information prediction model, electronic equipment and storage medium
CN115238909A (en) Data value evaluation method based on federal learning and related equipment thereof
WO2024094094A1 (en) Model training method and apparatus
US20200192797A1 (en) Caching data in artificial neural network computations
KR20220139248A (en) Neural network layer folding
CN114707643A (en) Model segmentation method and related equipment thereof
Shan et al. DRAC: a delta recurrent neural network-based arithmetic coding algorithm for edge computing
WO2023287392A1 (en) Systems and methods for federated learning of machine-learned models with sampled softmax
CN115879524A (en) Model training method and related equipment thereof
CN111709785B (en) Method, apparatus, device and medium for determining user retention time
US20240152799A1 (en) Generative graph modeling framework
WO2022171027A1 (en) Model training method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination