CN115271045A - Neural network model optimization method and system based on machine learning - Google Patents

Neural network model optimization method and system based on machine learning

Info

Publication number: CN115271045A
Authority: CN (China)
Prior art keywords: neural network, layer, network model, optimized, optimization
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202210959383.XA
Other languages: Chinese (zh)
Inventor: 刘欢
Current assignee: Zhengzhou University (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Zhengzhou University
Application filed by Zhengzhou University


Classifications

    • G06N 3/04 - Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology
    • G06N 20/00 - Machine learning
    • G06N 3/061 - Physical realisation, i.e. hardware implementation of neural networks using biological neurons, e.g. biological neurons connected to an integrated circuit
    • G06N 3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Neurology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a neural network model optimization method and system based on machine learning, relating to the field of machine learning. The method comprises the following steps: acquiring the structure and parameters of a neural network model to be optimized, the weight parameters corresponding to each layer, and the number of neurons in each layer; inserting a virtual layer above a target optimization layer of the model and converting the model with the added virtual layer into an initial model; and fusing the initial model with the acquired structure, parameters, per-layer weight parameters, and neuron counts, outputting a combined optimization strategy, and optimizing the model with that strategy. The method improves recognition or classification accuracy in image recognition while relieving the constraint that the layout of a neural network model on a hardware accelerator is limited.

Description

Neural network model optimization method and system based on machine learning
Technical Field
The application relates to the field of machine learning, in particular to a neural network model optimization method and system based on machine learning.
Background
Machine learning is central to artificial intelligence and is widely applied in tasks such as image recognition and image classification. A key component of machine learning is the optimization of a loss function, i.e., the function used to measure the gap between the output of a machine learning model and the desired output.
Neural network models, such as deep neural network (DNN) models, have been widely used in many fields owing to their strong ability to fit data. In general, the fitting ability of a DNN is positively correlated with the feature dimension of its input, the number of hidden-layer neurons, and the number of hidden layers, which also means that many memory cells are required to store a DNN model.
Take a 7-layer DNN model as an example, with [1000, 2000, 1000, 500, 100, 50, 1] neurons per layer (the 1st layer being the input layer and the last the output layer). The number of model parameters is 1000 × 2000 + 2000 × 1000 + 1000 × 500 + 500 × 100 + 100 × 50 + 50 × 1 = 4,555,050, and storing them at 4 bytes each takes 4,555,050 × 4 = 18,220,200 bytes, i.e., about 18 MB of storage space. In the image and speech fields, the width and depth actually needed are often ten or a hundred times those of this example, which means an online system needs far more memory units to store the model.
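The arithmetic above can be reproduced with a short sketch. It counts only the fully connected weight matrices between adjacent layers, ignoring biases, as the example in the text does:

```python
# Parameter and storage estimate for the 7-layer DNN example above.
layer_sizes = [1000, 2000, 1000, 500, 100, 50, 1]

# Each weight matrix connecting adjacent layers contributes M x N parameters.
num_params = sum(m * n for m, n in zip(layer_sizes, layer_sizes[1:]))
bytes_needed = num_params * 4  # 4 bytes per 32-bit weight

print(num_params)            # 4555050
print(bytes_needed / 2**20)  # roughly 17.4 MiB, i.e. "about 18 MB"
```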
A neural network model achieves accurate identification of a target object at the price of high computational complexity. Deploying such a model therefore requires a hardware accelerator with substantial memory that can carry the complex computation. In practical applications, however, the hardware used to deploy neural network models rarely has a very large memory space, so a neural network model cannot be deployed on a hardware accelerator with a small memory.
Disclosure of Invention
The application aims to provide a neural network model optimization method based on machine learning that improves recognition or classification accuracy in image recognition while relieving the constraint that the layout of a neural network model on a hardware accelerator is limited.
Another object of the present application is to provide a machine learning based neural network model optimization system, which can operate a machine learning based neural network model optimization method.
The embodiment of the application is realized as follows:
in a first aspect, an embodiment of the present application provides a neural network model optimization method based on machine learning, which includes: obtaining the structure and parameters of a neural network model to be optimized, the weight parameter corresponding to each layer, and the number of neurons in each layer; inserting a virtual layer above a target optimization layer of the model and converting the model with the added virtual layer into an initial model; and fusing the initial model with the obtained structure, parameters, per-layer weight parameters, and neuron counts, outputting a combined optimization strategy, and using that strategy to optimize the model.
In some embodiments of the present application, obtaining the structure and parameters of the neural network model to be optimized, the weight parameter corresponding to each layer, and the number of neurons in each layer includes: setting a learning framework for the model, where the framework comprises the model body and an optimizer for the model, the optimizer being used to adjust the structure and parameters of the corresponding network body, the weight parameters of each layer, and the number of neurons in each layer.
In some embodiments of the present application, the method further includes: computing, for each layer of the neural network model to be optimized, the product of its number of neurons and the number of neurons in the layer above it, and determining any layer whose product exceeds a preset threshold as a target optimization layer.
In some embodiments of the present application, adding the virtual layer above the target optimization layer of the neural network model to be optimized and converting the resulting model into the initial model includes: determining a virtual-layer weight scale factor from the weight coefficients of the virtual layer and the weight range that can be accommodated, and performing weight optimization on the virtual layer according to the scale factor and the virtual layer's weights.
In some embodiments of the present application, the above further includes: and adding a preset normalization layer after the virtual layer subjected to weight optimization processing.
In some embodiments of the present application, the fusing of the initial model with the structure, parameters, per-layer weight parameters, and neuron counts of the model to be optimized, and the optimization using the output combined optimization strategy, include: arranging the fused neural network models in the order given by the output combined optimization strategies, so that the model with the smaller predicted performance loss can be selected from among them.
In some embodiments of the present application, the method further includes: outputting the neural network model ranked first in the combined optimization strategy ordering, or outputting a weighted result of the several top-ranked neural network models.
In a second aspect, an embodiment of the present application provides a neural network model optimization system based on machine learning, which includes an obtaining module, configured to obtain a structure and parameters of a neural network model to be optimized, a weight parameter corresponding to each layer, and a number of neurons in each layer;
the virtual layer module is used for adding a virtual layer to the upper layer of the target optimization layer of the neural network model to be optimized and converting the neural network model to be optimized, which is added to the virtual layer, into an initial model;
and the optimization module is used for carrying out fusion processing on the initial model in combination with the structure and parameters of the neural network model to be optimized, the weight parameters corresponding to each layer and the number of the neurons of each layer, outputting a combined optimization strategy, and optimizing the neural network model to be optimized by using the combined optimization strategy.
In some embodiments of the present application, the above includes: at least one memory for storing computer instructions; at least one processor in communication with the memory, wherein the at least one processor, when executing the computer instructions, causes the system to: the device comprises an acquisition module, a virtual layer module and an optimization module.
In a third aspect, embodiments of the present application provide a computer-readable storage medium on which a computer program is stored, which, when executed by a processor, implements any of the machine learning-based neural network model optimization methods above.
Compared with the prior art, the embodiment of the application has at least the following advantages or beneficial effects:
First, fusion processing is performed on the optimization layer to reduce the computational dimensionality of the neural network model; then weight optimization is performed on the model according to the weight range the target hardware accelerator can accommodate, so that the accelerator and the model are adapted to each other. On the premise of improving the accuracy of the neural network model, the method provided by the application thus relieves the constraint that the layout of a neural network model on a hardware accelerator is limited.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a schematic diagram illustrating steps of a neural network model optimization method based on machine learning according to an embodiment of the present application;
fig. 2 is a schematic diagram illustrating detailed steps of a neural network model optimization method based on machine learning according to an embodiment of the present disclosure;
FIG. 3 is a block diagram of a system for optimizing a neural network model based on machine learning according to an embodiment of the present disclosure;
fig. 4 is an electronic device according to an embodiment of the present disclosure.
Icon: 10-an acquisition module; 20-virtual layer module; 30-an optimization module; 101-a memory; 102-a processor; 103-communication interface.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
It is to be noted that the term "comprises," "comprising," or any other variation thereof is intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the individual features of the embodiments can be combined with one another without conflict.
Example 1
Referring to fig. 1, fig. 1 is a schematic diagram illustrating steps of a neural network model optimization method based on machine learning according to an embodiment of the present disclosure, as follows:
s100, acquiring the structure and parameters of a neural network model to be optimized, weight parameters corresponding to each layer and the number of neurons of each layer;
in some embodiments, the neural network model to be optimized includes an input layer on the input side, an output layer on the output side, and an intermediate layer between the input layer and the output layer. Each layer takes the previous layer as input and the output results are transmitted to the next layer.
Step S110, adding a virtual layer to the upper layer of the target optimization layer of the neural network model to be optimized, and converting the neural network model to be optimized, which is added to the virtual layer, into an initial model;
in some embodiments, for each layer in the neural network model, a product of the number N of neurons in the layer and the number M of neurons in an upper layer is calculated, and a layer for which the calculated product M × N is greater than a preset threshold, such as 100,000, is determined as the target optimization layer. One or more layers in a neural network model may be determined as target optimization layers.
And step S120, fusing the initial model with the structure and parameters of the neural network model to be optimized, the weight parameters corresponding to each layer and the number of neurons of each layer, outputting a combined optimization strategy, and optimizing the neural network model to be optimized by using the combined optimization strategy.
In some embodiments, after the neural network model has been optimized, if an X-bit width is currently used for quantization, the model is tested on a few pictures and compared with the completely unquantized model; a large error between the two indicates that X bits are not enough to meet the precision requirement. In that case, X can be increased and the optimization re-performed. It should be noted that, compared with quantizing directly at an X-bit width, the method provided in the embodiment of the present application clearly improves accuracy.
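The bit-width check above can be sketched as a search loop. Everything here is an assumption for illustration: the symmetric fake-quantization scheme, the small ReLU network used as the model, and all function names; the patent does not specify its quantizer.

```python
import numpy as np

def quantize(w, bits):
    # Symmetric fake-quantization: round weights onto a (2^bits - 1)-level grid.
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def forward(x, weights):
    # Minimal ReLU MLP standing in for the model under test.
    for w in weights[:-1]:
        x = np.maximum(x @ w, 0.0)
    return x @ weights[-1]

def choose_bit_width(weights, test_x, tol=1e-2, start_bits=4, max_bits=16):
    # Widen X until the quantized model matches the unquantized one
    # on the test inputs, as described in the text.
    reference = forward(test_x, weights)
    for bits in range(start_bits, max_bits + 1):
        q = [quantize(w, bits) for w in weights]
        err = np.abs(forward(test_x, q) - reference).max()
        if err <= tol:
            return bits
    return max_bits
```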
Example 2
Referring to fig. 2, fig. 2 is a detailed step diagram of a neural network model optimization method based on machine learning according to an embodiment of the present application, which is shown as follows:
step S200, a learning framework of the neural network model to be optimized is set, the learning framework comprises a neural network model body to be optimized and an optimizer of the neural network model to be optimized, and the neural network model optimizer to be optimized is used for adjusting the structure and parameters of the neural network body corresponding to the neural network model optimizer, the weight parameters corresponding to each layer and the number of neurons of each layer.
Step S210: compute the product of the number of neurons in each layer of the neural network model to be optimized and the number of neurons in the layer above it, and determine any layer whose product exceeds a preset threshold as a target optimization layer.
Step S220: determine a virtual-layer weight scale factor from the weight coefficients of the virtual layer and the weight range that can be accommodated, and perform weight optimization on the virtual layer according to the scale factor and the virtual layer's weights.
Step S230: add a preset normalization layer after the virtual layer that has undergone weight optimization.
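Steps S220 and S230 can be sketched together. The symmetric [-limit, limit] hardware range and the division-based normalization that undoes the scaling are assumptions, not the patent's exact mathematics; the function name is illustrative:

```python
import numpy as np

def fit_virtual_layer(weights, hw_limit):
    # S220: scale factor mapping the virtual layer's weights into the
    # range the hardware accelerator can accommodate.
    scale = hw_limit / np.abs(weights).max()
    scaled = weights * scale  # now within [-hw_limit, hw_limit]
    # S230: a normalization step after the layer that compensates for the
    # scaling, so the network's overall output is unchanged.
    normalize = lambda y: y / scale
    return scaled, normalize

w = np.array([[3.0, -6.0], [1.5, 0.5]])
scaled_w, norm = fit_virtual_layer(w, hw_limit=1.0)
x = np.array([1.0, 2.0])
print(norm(x @ scaled_w))  # matches the unscaled x @ w: [ 6. -5.]
```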
Step S240: arrange the fused neural network models in the order given by the output combined optimization strategies, so that the model with the smaller predicted performance loss can be selected from among them.
Step S250: output the neural network model ranked first in the combined optimization strategy ordering, or output a weighted result of the several top-ranked neural network models.
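Steps S240 and S250 amount to ranking candidates by predicted loss and then emitting either the best one or a weighted blend of the top few. The candidate representation and the inverse-loss weighting below are illustrative assumptions:

```python
def rank_candidates(candidates):
    # S240: candidates is a list of (model_name, predicted_loss) pairs,
    # sorted so the smallest predicted performance loss comes first.
    return sorted(candidates, key=lambda c: c[1])

def weighted_output(predictions, losses):
    # S250 (alternative branch): weight each top-k model's prediction
    # inversely to its predicted loss.
    weights = [1.0 / l for l in losses]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total

ranked = rank_candidates([("A", 0.12), ("B", 0.05), ("C", 0.30)])
print(ranked[0][0])  # "B": the smallest predicted loss
# Blend leans toward the lower-loss model's prediction:
print(weighted_output([1.0, 2.0], [0.1, 0.2]))
```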
In some embodiments, the optimizer continuously optimizes the neural network model as the machine learning task proceeds, so the model's parameters keep changing; at any specific moment, however, the model body and its corresponding parameters are fixed values, and together they form a neural network. The features of this network are what the features of the learning framework actually denote, and they include not only the state features of the network but also its quality parameters.
The state features of the neural network model may include information about its loss function, for example a statistical indicator of the loss values on the current input samples. Typical loss functions include the squared loss, the exponential loss, and the negative log-likelihood. The state features may also include statistical indicators of the node output values of the network, as well as the difference between the current gradient information and the gradient information of the previous stage.
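The loss functions named above can be written out concretely; a batch statistic of any of them serves as a state feature. The function names and label conventions are assumptions for illustration:

```python
import numpy as np

def squared_loss(y_true, y_pred):
    return (y_true - y_pred) ** 2

def exponential_loss(y_true, y_pred):
    # labels in {-1, +1}, as in boosting-style formulations
    return np.exp(-y_true * y_pred)

def negative_log_likelihood(y_true, p_pred):
    # binary labels in {0, 1}, predicted probabilities in (0, 1)
    return -(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))

# A statistical indicator of the loss values on the current samples,
# usable as a state feature of the learning framework.
y, p = np.array([1.0, 0.0]), np.array([0.9, 0.2])
state_feature = negative_log_likelihood(y, p).mean()
```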
Example 3
Referring to fig. 3, fig. 3 is a schematic diagram of a neural network model optimization system based on machine learning according to an embodiment of the present application, which is shown as follows:
an obtaining module 10, configured to obtain a structure and parameters of a neural network model to be optimized, a weight parameter corresponding to each layer, and the number of neurons in each layer;
the virtual layer module 20 is configured to add a virtual layer to an upper layer of a target optimization layer of a neural network model to be optimized, and convert the neural network model to be optimized, which is added to the virtual layer, into an initial model;
and the optimization module 30 is configured to perform fusion processing on the initial model in combination with the structure and parameters of the neural network model to be optimized, the weight parameter corresponding to each layer, and the number of neurons in each layer, output a combinatorial optimization strategy, and optimize the neural network model to be optimized by using the combinatorial optimization strategy.
As shown in fig. 4, an embodiment of the present application provides an electronic device, which includes a memory 101 for storing one or more programs; a processor 102. The one or more programs, when executed by the processor 102, implement the method of any of the first aspects as described above.
Also included is a communication interface 103, with the memory 101, processor 102, and communication interface 103 being electrically connected to each other, directly or indirectly, to enable transfer or interaction of data. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The memory 101 may be used to store software programs and modules, and the processor 102 executes the software programs and modules stored in the memory 101 to thereby execute various functional applications and data processing. The communication interface 103 may be used for communicating signaling or data with other node devices.
The memory 101 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 102 may be an integrated circuit chip having signal processing capabilities. It may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In the embodiments provided in the present application, it should be understood that the disclosed method and system may be implemented in other ways. The method and system embodiments described above are merely illustrative and, for example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods and systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
In another aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which, when executed by the processor 102, implements the method according to any one of the first aspect described above. The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory 101 (ROM), a Random Access Memory 101 (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
To sum up, according to the neural network model optimization method and system based on machine learning provided by the embodiment of the present application, the optimization layer is fused to reduce the computational dimension of the neural network model, and then the weight optimization processing is performed on the neural network model according to the weight range that the target hardware accelerator can accommodate, so that the target hardware accelerator is adapted to the neural network model. The method provided by the application solves the problem that the layout of the neural network model on a hardware accelerator is limited on the premise of improving the accuracy of the neural network model.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (10)

1. A neural network model optimization method based on machine learning is characterized by comprising the following steps:
acquiring the structure and parameters of a neural network model to be optimized, weight parameters corresponding to each layer and the number of neurons of each layer;
adding a virtual layer to the upper layer of a target optimization layer of a neural network model to be optimized, and converting the neural network model to be optimized, which is added to the virtual layer, into an initial model;
and fusing the initial model with the structure and parameters of the neural network model to be optimized, the weight parameters corresponding to each layer and the number of neurons of each layer, outputting a combined optimization strategy, and optimizing the neural network model to be optimized by using the combined optimization strategy.
2. The method of claim 1, wherein the obtaining of the structure and parameters of the neural network model to be optimized, the weight parameters corresponding to each layer, and the number of neurons in each layer comprises:
and setting a learning framework of the neural network model to be optimized, wherein the learning framework comprises a neural network model body to be optimized and an optimizer of the neural network model to be optimized, and the neural network model optimizer to be optimized is used for adjusting the structure and parameters of the neural network body corresponding to the neural network model optimizer, the weight parameters corresponding to each layer and the number of neurons of each layer.
3. The machine learning-based neural network model optimization method of claim 2, further comprising:
computing, for each layer of the neural network model to be optimized, the product of the number of neurons in that layer and the number of neurons in the layer above it, and determining each layer whose product exceeds a preset threshold as a target optimization layer.
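The target-layer selection of claim 3 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the layer-size list and the threshold value are assumptions for the example.

```python
def select_target_layers(layer_sizes, threshold):
    """Return indices of layers whose neuron count, multiplied by the
    neuron count of the layer above (the preceding layer), exceeds the
    preset threshold — the selection rule described in claim 3."""
    targets = []
    for i in range(1, len(layer_sizes)):
        if layer_sizes[i] * layer_sizes[i - 1] > threshold:
            targets.append(i)
    return targets

# Example: a 784-256-64-10 fully connected network with threshold 100,000.
# Only the 784x256 connection (200,704 weights) exceeds the threshold.
print(select_target_layers([784, 256, 64, 10], 100_000))  # → [1]
```

The product of adjacent layer widths is exactly the number of weights in a fully connected layer, so this rule picks out the layers that dominate the model's parameter count.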
4. The machine learning-based neural network model optimization method of claim 1, wherein adding a virtual layer above the target optimization layer of the neural network model to be optimized, and converting the neural network model with the added virtual layer into an initial model, comprises:
determining a virtual layer weight scale factor from the weight coefficients of the virtual layer and the weight range it can accommodate, and performing weight optimization on the virtual layer according to the scale factor and the weights of the virtual layer.
5. The machine learning-based neural network model optimization method of claim 4, further comprising:
adding a preset normalization layer after the weight-optimized virtual layer.
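Claims 4 and 5 can be read as a weight-rescaling step followed by an appended normalization layer. A minimal NumPy sketch follows; the claims do not fix a formula, so the max-absolute-value mapping into the accommodated range and the layer-norm-style normalization are both illustrative assumptions.

```python
import numpy as np

def rescale_virtual_layer(weights, weight_range):
    """Claim 4 (sketch): derive a scale factor from the virtual layer's
    weight coefficients and the weight range it can accommodate, then
    rescale the weights. The max-abs mapping is an assumed choice."""
    max_abs = np.max(np.abs(weights))
    scale = weight_range / max_abs if max_abs > 0 else 1.0
    return weights * scale, scale

def normalize(x, eps=1e-5):
    """Claim 5 (sketch): a preset normalization layer appended after the
    weight-optimized virtual layer (layer-norm style, an assumption)."""
    return (x - x.mean()) / np.sqrt(x.var() + eps)

w = np.array([[0.2, -4.0], [1.0, 2.0]])
w_scaled, s = rescale_virtual_layer(w, weight_range=1.0)
print(s)                       # → 0.25
print(np.abs(w_scaled).max())  # → 1.0
```

Rescaling keeps every virtual-layer weight inside the accommodated range, and the normalization layer then stabilizes the activations flowing into the target optimization layer.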
6. The machine learning-based neural network model optimization method of claim 1, wherein fusing the initial model with the structure and parameters of the neural network model to be optimized, the weight parameters corresponding to each layer, and the number of neurons in each layer, outputting a combined optimization strategy, and optimizing the neural network model to be optimized using the combined optimization strategy comprises:
arranging the fused neural network models according to the ranking order of the output combined optimization strategy, so as to select, from the output combined optimization strategy, the neural network model with the smaller predicted performance loss.
7. The machine learning-based neural network model optimization method of claim 6, further comprising:
selecting the first-ranked neural network model in the combined optimization strategy ranking for output, or outputting a weighted result of several of the top-ranked neural network models.
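The selection step of claims 6 and 7 — rank candidates by predicted performance loss, then output either the best model or a weighted combination of the top few — can be sketched as below. The candidate representation as (predicted_loss, output) pairs and the inverse-loss weighting scheme are assumptions; the claims specify only the ranking and the two output modes.

```python
def select_output(candidates, top_k=1):
    """Claims 6-7 (sketch): rank candidates by predicted performance
    loss (ascending). With top_k == 1, return the best model's output;
    otherwise return an inverse-loss-weighted combination of the top-k
    outputs. Each candidate is a (predicted_loss, output) pair."""
    ranked = sorted(candidates, key=lambda c: c[0])
    if top_k == 1:
        return ranked[0][1]
    top = ranked[:top_k]
    weights = [1.0 / (loss + 1e-8) for loss, _ in top]  # lower loss, higher weight
    total = sum(weights)
    return sum(w * out for w, (_, out) in zip(weights, top)) / total

models = [(0.30, 0.7), (0.10, 0.9), (0.20, 0.8)]
print(select_output(models))  # → 0.9
```

For `top_k=2` the result blends the two lowest-loss outputs, weighted toward the lower-loss model — one plausible reading of "the weighting results of a plurality of neural network models ranked at the top."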
8. A machine learning-based neural network model optimization system, characterized by comprising:
an acquisition module for acquiring the structure and parameters of a neural network model to be optimized, the weight parameters corresponding to each layer, and the number of neurons in each layer;
a virtual layer module for adding a virtual layer above a target optimization layer of the neural network model to be optimized and converting the neural network model with the added virtual layer into an initial model; and
an optimization module for fusing the initial model with the structure and parameters of the neural network model to be optimized, the weight parameters corresponding to each layer, and the number of neurons in each layer, outputting a combined optimization strategy, and optimizing the neural network model to be optimized using the combined optimization strategy.
9. The machine learning-based neural network model optimization system of claim 8, comprising:
at least one memory for storing computer instructions; and
at least one processor in communication with the memory, wherein execution of the computer instructions by the at least one processor causes the system to implement the acquisition module, the virtual layer module, and the optimization module.
10. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the method of any one of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210959383.XA CN115271045A (en) 2022-08-10 2022-08-10 Neural network model optimization method and system based on machine learning


Publications (1)

Publication Number Publication Date
CN115271045A true CN115271045A (en) 2022-11-01

Family

ID=83750742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210959383.XA Pending CN115271045A (en) 2022-08-10 2022-08-10 Neural network model optimization method and system based on machine learning

Country Status (1)

Country Link
CN (1) CN115271045A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117075549A (en) * 2023-08-17 2023-11-17 湖南源达智能科技有限公司 Plant control method and system based on artificial neural network
CN117075549B (en) * 2023-08-17 2024-03-22 湖南源达智能科技有限公司 Plant control method and system based on artificial neural network

Similar Documents

Publication Publication Date Title
CN112559784B (en) Image classification method and system based on incremental learning
CN111079780B (en) Training method for space diagram convolution network, electronic equipment and storage medium
CN111026544B (en) Node classification method and device for graph network model and terminal equipment
JPH05346915A (en) Learning machine and neural network, and device and method for data analysis
CN110968692B (en) Text classification method and system
CN112685539A (en) Text classification model training method and device based on multi-task fusion
CN116596095B (en) Training method and device of carbon emission prediction model based on machine learning
CN114048468A (en) Intrusion detection method, intrusion detection model training method, device and medium
CN116822651A (en) Large model parameter fine adjustment method, device, equipment and medium based on incremental learning
CN111639607A (en) Model training method, image recognition method, model training device, image recognition device, electronic equipment and storage medium
CN115271045A (en) Neural network model optimization method and system based on machine learning
CN112749737A (en) Image classification method and device, electronic equipment and storage medium
CN111178196B (en) Cell classification method, device and equipment
CN112396428A (en) User portrait data-based customer group classification management method and device
CN110245700B (en) Classification model construction method, classification model and object identification method
CN113609337A (en) Pre-training method, device, equipment and medium of graph neural network
CN113077051B (en) Network model training method and device, text classification model and network model
CN115238645A (en) Asset data identification method and device, electronic equipment and computer storage medium
CN117523218A (en) Label generation, training of image classification model and image classification method and device
CN112417290A (en) Training method of book sorting push model, electronic equipment and storage medium
CN112085040A (en) Object tag determination method and device and computer equipment
CN111709479B (en) Image classification method and device
CN115168619B (en) Entity relationship extraction method and related device, electronic equipment and storage medium
CN117852507B (en) Restaurant return guest prediction model, method, system and equipment
CN114238634B (en) Regular expression generation method, application, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination