CN115081624A - Neural network model adjusting method and device and computer readable storage medium


Info

Publication number
CN115081624A
CN115081624A
Authority
CN
China
Prior art keywords
data
neural network
network model
target
input data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210820863.8A
Other languages
Chinese (zh)
Inventor
瞿晓阳 (Qu Xiaoyang)
王健宗 (Wang Jianzong)
陶伟 (Tao Wei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202210820863.8A
Publication of CN115081624A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a device for adjusting a neural network model and a computer-readable storage medium. The adjusting method comprises the following steps: acquiring test data, and inputting the test data into a preset first neural network model to obtain a first output result; obtaining a first adjustable variable from the first output result; acquiring first target input data corresponding to the first adjustable variable; and adjusting at least two first network layers in the first neural network model according to the first adjustable variable, the first target input data and a preset network layer adjustment rule to obtain a second neural network model. According to the technical scheme of the application, the network layers of the neural network model are adjusted through the first adjustable variable, the first target input data and the network layer adjustment rule, so that a new neural network model with reduced memory consumption can be effectively obtained.

Description

Neural network model adjusting method and device and computer readable storage medium
Technical Field
The present invention relates to, but is not limited to, the technical field of artificial intelligence, and in particular to a method and an apparatus for adjusting a neural network model, and a computer-readable storage medium.
Background
In recent years, with the rapid development of Artificial Intelligence (AI), neural network models have been applied more and more widely, and have proven their value in fields such as computer vision, speech recognition and natural language processing. Meanwhile, as intelligent electronic devices develop, more and more types of electronic devices can run a neural network model; among them, Microcontrollers (MCUs) are popular because they are small, portable, cheap and low in energy consumption, so combining MCUs with neural network models has become a clear trend. However, a neural network model generates a large amount of data while running, and its network structure itself also occupies memory, whereas an MCU has no traditional cache-memory-hard disk storage hierarchy and only a small amount of memory, so it can hardly support a neural network model with a large memory requirement. A neural network model therefore needs to be adjusted and optimized before it is deployed to run on an MCU.
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
The embodiment of the invention provides a method and a device for adjusting a neural network model and a computer readable storage medium.
In a first aspect, an embodiment of the present invention provides a method for adjusting a neural network model, including:
acquiring test data, and inputting the test data into a preset first neural network model to obtain a first output result;
obtaining a first adjustable variable from the first output result;
acquiring first target input data corresponding to the first adjustable variable;
and adjusting at least two first network layers in the first neural network model according to the first adjustable variable, the first target input data and a preset network layer adjustment rule to obtain a second neural network model.
In some embodiments, the obtaining first target input data corresponding to the first adjustable variable comprises:
acquiring a preset relational mapping table, wherein the relational mapping table represents the mapping relation between the first output result and a network output layer, and the network output layer is the network layer for generating the output result;
determining a network output layer corresponding to the first adjustable variable in the relational mapping table as a target network output layer;
and obtaining the first target input data according to the target network output layer, wherein the first target input data is data input to the target network output layer.
In some embodiments, the first adjustable variable comprises at least two first variables; adjusting at least two first network layers in the neural network model according to the first adjustable variable, the target input data and a preset network layer adjustment rule to obtain a second neural network model, including:
determining a first target variable from at least two first variables, and acquiring first input data corresponding to the first target variable from the first target input data;
determining a network output layer corresponding to the first target variable in the relational mapping table as a first network output layer, and acquiring a first network layer identifier, wherein the first network layer identifier is a parameter uniquely identifying the first network output layer;
determining the sum of the memory occupancy value of the first adjustable variable and the memory occupancy value of the first input data as a first memory occupancy value;
deleting the first target variable in the first adjustable variables, and performing data merging processing on the first adjustable variables after the first target variable is deleted and the first input data to obtain a first intermediate data set, wherein the first intermediate data set comprises second adjustable variables, and the second adjustable variables comprise at least two second variables;
determining a second target variable from at least two second variables, and acquiring second target input data corresponding to the second target variable;
determining the sum of the memory occupancy value of the second adjustable variable and the memory occupancy value of the second target input data as a second memory occupancy value;
and adjusting at least two first network layers in the neural network model according to the first memory occupancy value, the second memory occupancy value and the first network layer identifier to obtain the second neural network.
In some embodiments, after the adjusting at least two first network layers in the neural network model according to the first memory footprint value, the second memory footprint value, and the first network layer identifier to obtain the second neural network, the method further includes:
when the number of the second variables remaining in the second adjustable variables is larger than a preset first threshold value, determining the second adjustable variables as new first adjustable variables;
acquiring new first target input data corresponding to the new first adjustable variable;
and adjusting at least two second network layers in the second neural network model according to the new first adjustable variable, the new first target input data and the network layer adjustment rule to obtain a new second neural network model.
In some embodiments, the second neural network model includes at least two second network layers connected in sequence, and after the at least two first network layers in the first neural network model are adjusted according to the first adjustable variable, the target input data and a preset network layer adjustment rule to obtain the second neural network model, the method further includes:
acquiring a third memory occupation value corresponding to each second network layer;
determining a boundary network layer from at least two second network layers, wherein the ratio of the sum of the third memory occupation values corresponding to the second network layers which are sequenced before the boundary network layer to the sum of the third memory occupation values corresponding to the second network layers which are sequenced after the boundary network layer is greater than a preset second threshold value.
In some embodiments, after said determining a boundary network layer from at least two of said second network layers, said method further comprises:
acquiring data to be processed, and inputting the data to be processed into the second neural network model to obtain a second output result, wherein the second output result comprises at least two optional output data;
acquiring optional input data corresponding to each optional output data;
acquiring target output data from at least two optional output data, wherein the network output layer corresponding to the target output data is the second network layer which is positioned in front of the boundary network layer in sequence;
acquiring third target input data, wherein the third target input data is input data corresponding to the target output data;
performing data merging processing on the target output data and the third target input data to obtain a second intermediate data set;
classifying the data in the second intermediate data set according to a preset data classification rule to obtain first intermediate data and second intermediate data, wherein the task target of the first intermediate data is an IO task, and the task target of the second intermediate data is a calculation task;
dividing a storage device into a first storage area and a second storage area, wherein the storage device is used for running the second neural network model;
sending the first intermediate data to the first storage area to execute input/output (IO) operation;
and sending the second intermediate data to the second storage area to execute data operation.
In some embodiments, prior to said inputting said data to be processed into said second neural network model, said method further comprises:
acquiring a preset data cleaning rule;
and carrying out data cleaning on the data to be processed according to the data cleaning rule to obtain new data to be processed.
In a second aspect, an embodiment of the present invention provides an apparatus for adjusting a neural network model, including:
the first output result acquisition module is used for acquiring test data, inputting the test data to a preset first neural network model and obtaining a first output result;
a first adjustable variable obtaining module, configured to obtain a first adjustable variable from the first output result;
a first target input data acquisition module, configured to acquire first target input data corresponding to the first adjustable variable;
and the neural network model adjusting module is used for adjusting at least two first network layers in the first neural network model according to the first adjustable variable, the first target input data and a preset network layer adjusting rule to obtain a second neural network model.
In a third aspect, an embodiment of the present invention further provides an apparatus for adjusting a neural network model, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of adjusting a neural network model according to the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, which stores a computer program for executing the method for adjusting a neural network model according to the first aspect.
The embodiment of the invention comprises the following steps: acquiring test data, and inputting the test data into a preset first neural network model to obtain a first output result; obtaining a first adjustable variable from the first output result; acquiring first target input data corresponding to the first adjustable variable; and adjusting at least two first network layers in the first neural network model according to the first adjustable variable, the first target input data and a preset network layer adjustment rule to obtain a second neural network model. According to the scheme provided by the embodiment of the invention, the network layer of the neural network model is adjusted through the first adjustable variable, the first target input data and the network layer adjusting rule, so that a new neural network model with reduced memory consumption can be effectively obtained.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the present invention and are incorporated in and constitute a part of this specification; they illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention without limiting it.
FIG. 1 is a flow chart of the steps of a method for tuning a neural network model according to an embodiment of the present invention;
FIG. 2 is a flowchart of the steps provided by another embodiment of the present invention to obtain first target input data;
FIG. 3 is a flowchart illustrating steps of a method for tuning a neural network model according to another embodiment of the present invention;
FIG. 4 is a flow chart illustrating steps of a method for tuning a neural network model according to another embodiment of the present invention;
FIG. 5 is a flowchart illustrating the steps of determining a boundary network layer according to another embodiment of the present invention;
FIG. 6 is a flow chart of steps provided by another embodiment of the present invention for running a second neural network model;
FIG. 7 is a flowchart illustrating steps for data cleansing of data to be processed according to another embodiment of the present invention;
FIG. 8 is a schematic diagram of network layer reordering as provided in another embodiment of the present invention;
FIG. 9 is a block diagram of an apparatus for adjusting a neural network model according to another embodiment of the present invention;
fig. 10 is a block diagram of an adjusting apparatus of a neural network model according to another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It should be noted that although functional blocks are partitioned in a schematic diagram of an apparatus and a logical order is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the partitioning of blocks in the apparatus or the order in the flowchart. The terms "first," "second," and the like in the description, in the claims, or in the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The invention provides a method and a device for adjusting a neural network model and a computer-readable storage medium. The adjusting method comprises the following steps: acquiring test data, and inputting the test data into a preset first neural network model to obtain a first output result; obtaining a first adjustable variable from the first output result; acquiring first target input data corresponding to the first adjustable variable; and adjusting at least two first network layers in the first neural network model according to the first adjustable variable, the first target input data and a preset network layer adjustment rule to obtain a second neural network model. According to the technical scheme of the application, the network layers of the neural network model are adjusted through the first adjustable variable, the first target input data and the network layer adjustment rule, so that a new neural network model with reduced memory consumption can be effectively obtained.
The embodiments of the present invention will be further explained with reference to the drawings.
As shown in fig. 1, fig. 1 is a flowchart illustrating the steps of a method for adjusting a neural network model according to an embodiment of the present invention; the method for adjusting a neural network model includes, but is not limited to, the following steps:
step S110, acquiring test data, inputting the test data to a preset first neural network model, and obtaining a first output result;
step S120, obtaining a first adjustable variable from the first output result;
step S130, acquiring first target input data corresponding to a first adjustable variable;
step S140, adjusting at least two first network layers in the first neural network model according to the first adjustable variable, the first target input data, and a preset network layer adjustment rule, to obtain a second neural network model.
In addition, the embodiment of the present application does not limit the specific model structure of the first neural network model, which may be, for example, a MobileNet lightweight network model, an AlexNet network model or a VGG16 network model.
It should be noted that the embodiment of the present application does not limit the data format of the first adjustable variable and the first target input data. The data format may be a tensor, and is not limited to any specific tensor form: it may be a 0-dimensional tensor (a scalar), a 1-dimensional tensor (a vector), a 2-dimensional tensor (a matrix) or a higher-dimensional tensor, which is not described in detail herein.
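As a purely illustrative sketch (the variable names below are hypothetical, and NumPy is used only as a stand-in since the embodiment is framework-agnostic), the tensor forms listed above can be written as follows:

```python
# A minimal sketch of the tensor ranks mentioned above; the patent does not
# prescribe any particular framework or naming.
import numpy as np

scalar      = np.array(3.0)              # 0-dimensional tensor (a scalar)
vector      = np.array([1.0, 2.0, 3.0])  # 1-dimensional tensor (a vector)
matrix      = np.zeros((4, 4))           # 2-dimensional tensor (a matrix)
feature_map = np.zeros((1, 3, 32, 32))   # high-dimensional tensor (e.g. NCHW)

for t in (scalar, vector, matrix, feature_map):
    print(t.ndim, t.shape, t.nbytes)     # rank, shape, memory footprint in bytes
```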
It can be understood that, in the embodiment of the present application, the first adjustable variable is obtained from the first output result of the first neural network model, and at least two first network layers in the first neural network model are adjusted and optimized according to the first target input data corresponding to the first adjustable variable and a preset network layer adjustment rule, so as to obtain a second neural network model with a reduced memory occupancy; the neural network model can thus be deployed on an electronic device more easily. The embodiment of the application does not limit the specific type of the electronic device running the second neural network model; it may be a mobile phone terminal or an MCU.
It should be noted that the embodiment of the present application does not limit the specific operations for adjusting at least two first network layers of the first neural network model. Based on the preset network layer adjustment rule, adjusting the first network layers may mean fusing a plurality of first network layers, deleting a certain first network layer, or reordering the first network layers.
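For illustration only, one widely used form of network layer fusion is folding a batch-normalization layer into the preceding convolution layer. The patent does not name this particular fusion, so the sketch below is merely a hypothetical instance of merging two first network layers into one:

```python
# A hedged sketch: fold BatchNorm parameters into the preceding convolution so
# that two network layers become one. The shapes follow the usual convention
# and are an assumption, not taken from the patent.
import numpy as np

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Return fused conv weights/bias equivalent to conv followed by BN.

    w: (out_c, in_c, kh, kw) conv weights; b: (out_c,) conv bias;
    gamma, beta, mean, var: (out_c,) BatchNorm parameters.
    """
    scale = gamma / np.sqrt(var + eps)        # per-output-channel scaling
    w_fused = w * scale[:, None, None, None]  # scale each output filter
    b_fused = (b - mean) * scale + beta       # shift the bias accordingly
    return w_fused, b_fused
```

After such a fusion the intermediate activation between the two layers no longer needs to be stored, which is the kind of memory saving the adjustment rule aims at.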
Referring also to fig. 2, in an embodiment, step S130 in the embodiment shown in fig. 1 includes, but is not limited to, the following steps:
step S210, acquiring a preset relational mapping table, wherein the relational mapping table represents the mapping relation between the first output result and a network output layer, and the network output layer is a network layer for generating the output result;
step S220, determining a network output layer corresponding to the first adjustable variable in the relational mapping table as a target network output layer;
step S230, obtaining first target input data according to the target network output layer, where the first target input data is data input to the target network output layer.
It can be understood that by obtaining a relational mapping table representing a mapping relationship between the first output result and the network output layer, determining a target network output layer corresponding to the first adjustable variable from the relational mapping table, and obtaining first target input data input to the target network output layer, an effective data basis can be provided for adjusting the network layer in the first neural network model.
It should be noted that the first target input data in the embodiment of the present application may be the output data of a network layer ordered before the target network output layer, or may be the test data initially input to the first neural network model, which is not limited herein.
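A minimal sketch of steps S210 to S230 follows, assuming the relational mapping table can be held as a plain dictionary from output data to the network output layer that produces it; every name in it is illustrative rather than taken from the patent:

```python
# Hypothetical relational mapping table: adjustable variable -> producing layer.
relation_map = {
    "out_a": "conv_1",
    "out_b": "dense_3",
}
# Data fed into each network output layer (the basis of the first target input data).
layer_inputs = {
    "conv_1": ["test_input"],
    "dense_3": ["out_a", "out_pool"],
}

def first_target_input_data(adjustable_var):
    target_layer = relation_map[adjustable_var]  # step S220
    return layer_inputs[target_layer]            # step S230

print(first_target_input_data("out_b"))  # -> ['out_a', 'out_pool']
```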
Additionally, referring to FIG. 3, in one embodiment, the first adjustable variable includes at least two first variables; step S140 in the embodiment shown in fig. 1 includes, but is not limited to, the following steps:
step S310, determining a first target variable from at least two first variables, and acquiring first input data corresponding to the first target variable from the first target input data;
step S320, determining a network output layer corresponding to the first target variable in the relational mapping table as a first network output layer, and acquiring a first network layer identifier, wherein the first network layer identifier is a parameter uniquely identifying the first network output layer;
step S330, determining a sum of the memory occupancy value of the first adjustable variable and the memory occupancy value of the first input data as a first memory occupancy value;
step S340, deleting a first target variable in the first adjustable variables, and performing data merging processing on the first adjustable variable after the first target variable is deleted and the first input data to obtain a first intermediate data set, where the first intermediate data set includes a second adjustable variable, and the second adjustable variable includes at least two second variables;
step S350, determining a second target variable from at least two second variables, and acquiring second target input data corresponding to the second target variable;
step S360, determining the sum of the memory occupancy value of the second adjustable variable and the memory occupancy value of the second target input data as a second memory occupancy value;
step S370, adjusting at least two first network layers in the neural network model according to the first memory occupancy value, the second memory occupancy value, and the first network layer identifier, to obtain a second neural network.
In addition, referring to fig. 4, in an embodiment, after performing step S370 in the embodiment shown in fig. 3, the method for adjusting the neural network model further includes, but is not limited to, the following steps:
step S410, when the number of the remaining second variables in the second adjustable variables is larger than a preset first threshold value, determining the second adjustable variables as new first adjustable variables;
step S420, acquiring new first target input data corresponding to the new first adjustable variable;
step S430, adjusting at least two second network layers in the second neural network model according to the new first adjustable variable, the new first target input data and the network layer adjustment rule, so as to obtain a new second neural network model.
It can be understood that, in order to describe more clearly the steps of adjusting the first network layers in the first neural network model and the second network layers in the second neural network model, a specific example is given as follows. First, a first output result S1 is obtained, where S1 comprises a constant data set CS and a first adjustable variable AS1; a first target variable x1 is selected from AS1, the network output layer l1 corresponding to x1 is obtained, the first network layer identifier of l1 is determined, the input data set IS1 of l1 is acquired, and the sum of the memory occupied by IS1 and AS1 is calculated to obtain a first memory occupancy value P. Next, x1 is deleted from AS1, and AS1 with x1 deleted is merged with IS1 to obtain a first intermediate data set S2, where S2 comprises a second adjustable variable AS2; a second target variable x2 is obtained from AS2, the network output layer l2 corresponding to x2 is obtained, the input data set IS2 of l2 is acquired, and the sum of the memory occupied by IS2 and AS2 is calculated to obtain a second memory occupancy value Q. P and Q are then compared: when P is smaller than Q, the network layer l1 is determined to be a target network layer, l1 is selected through its first network layer identifier when ordering and scheduling the first neural network model, and l1 is determined to be the first network layer of the second neural network model. Then x2 is deleted from AS2, the data merging is repeated, and a new first memory occupancy value and a new second memory occupancy value are calculated; when the new first memory occupancy value is smaller than the new second memory occupancy value, the second network layer identifier of the network output layer l2 corresponding to x2 is obtained, l2 is determined to be the new target network layer, and l2 is confirmed to be the second network layer of the second neural network model. This continues until the adjustable variables are exhausted, that is, until the number of second variables remaining in AS2 equals the first threshold, which is 0 in this example; at this point the memory occupancy value of CS is output, the ordering of the first network layers of the first neural network model is complete, and the second neural network model is obtained.
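One way to read the worked example above is as a greedy loop that repeatedly picks the target variable whose delete-and-merge step leaves the smallest memory footprint; the sketch below follows that reading and is an interpretation of the example, not the patent's literal procedure:

```python
# A hedged sketch of the greedy ordering loop. sizes maps tensor names to
# memory occupancy values; producer_of maps a tensor to the layer that outputs
# it (the relational mapping table); inputs_of maps a layer to its input
# tensors (IS1, IS2, ... in the example). Assumes one variable per layer.
def mem(tensors, sizes):
    return sum(sizes[t] for t in tensors)

def order_layers(adjustable, sizes, producer_of, inputs_of):
    schedule = []
    frontier = set(adjustable)            # AS1: variables still to place
    while frontier:
        best = None
        for x in frontier:                # candidate target variable (x1, x2, ...)
            merged = (frontier - {x}) | set(inputs_of[producer_of[x]])
            cost = mem(merged, sizes)     # memory occupancy value (P or Q)
            if best is None or cost < best[1]:
                best = (x, cost, merged)
        x, _, merged = best
        schedule.append(producer_of[x])   # record the layer identifier (l1, l2, ...)
        frontier = {t for t in merged if t in producer_of}  # drop constants/inputs
    return schedule
```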
For the network layer ordering of the embodiment of the present application, referring to fig. 8, assume that the memory occupancy value of network layer 810 is 50, that of network layer 820 is 40, that of network layer 830 is 30, that of network layer 840 is 35, that of network layer 850 is 40, that of network layer 860 is 50, that of network layer 870 is 25 and that of network layer 880 is 20. The output activation of network layer 810 is sent to two branches, which finally merge at network layer 880, and a network layer's output can only be released after all the subsequent network layers that consume it have been scheduled. With the scheduling order 810, 820, 830, 840, 850, 860, 870, 880, the peak memory occupancy appears at network layer 860 and is 125 (40 of network layer 850 + 50 of network layer 860 + 35 of network layer 840); with the order 810, 850, 860, 870, 820, 830, 840, 880, the peak also appears at network layer 860, but its value is 140 (40 of network layer 850 + 50 of network layer 860 + 50 of network layer 810). It can thus be seen that arranging the operators rationally, that is, reordering the network layers, effectively reduces the peak memory occupancy.
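The arithmetic in fig. 8 can be checked mechanically. The sketch below assumes a layer's output stays resident from the step its layer runs until the step of its last consumer, which reproduces the 125 and 140 values quoted above:

```python
# Layer sizes and the consumer graph of fig. 8 (810 feeds two branches that
# merge at 880). The liveness rule used here is one plausible reading of the
# text, not a definition taken from the patent.
size = {810: 50, 820: 40, 830: 30, 840: 35, 850: 40, 860: 50, 870: 25, 880: 20}
consumers = {810: [820, 850], 820: [830], 830: [840], 840: [880],
             850: [860], 860: [870], 870: [880], 880: []}

def peak_memory(schedule):
    pos = {layer: i for i, layer in enumerate(schedule)}
    peak = 0
    for step, layer in enumerate(schedule):
        # While `layer` runs, its own output is live together with every
        # earlier output that still has a consumer at this step or later.
        live = sum(size[l] for l in schedule
                   if pos[l] < step and any(pos[c] >= step for c in consumers[l]))
        peak = max(peak, size[layer] + live)
    return peak

print(peak_memory([810, 820, 830, 840, 850, 860, 870, 880]))  # -> 125
print(peak_memory([810, 850, 860, 870, 820, 830, 840, 880]))  # -> 140
```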
In addition, referring to fig. 5, in an embodiment, the second neural network model includes at least two second network layers connected in sequence, and after performing step S140 in the embodiment shown in fig. 1, the method for adjusting the neural network model further includes, but is not limited to, the following steps:
step S510, obtaining a third memory occupancy value corresponding to each second network layer;
step S520, determining a boundary network layer from the at least two second network layers, where the ratio of the sum of the third memory occupancy values corresponding to the second network layers ordered before the boundary network layer to the sum of the third memory occupancy values corresponding to the second network layers ordered after the boundary network layer is greater than a preset second threshold.
It can be understood that the network layers in a convolutional neural network can be roughly classified into computation-intensive layers and IO-intensive layers. The computation-intensive layers are mainly convolution layers; the IO-intensive layers are mainly fully-connected layers and depthwise-separable convolution layers; the remaining network layers, such as pooling layers and non-linear layers, are cheap, so their memory occupancy values need not be considered. The execution time of a convolutional neural network is determined by the computation time of the computation-intensive layers and the IO time of the IO-intensive layers. The computation-intensive layers are mainly concentrated in the first layers of the network, and their memory accounts for most of the memory of the whole network, so determining the boundary network layer between the computation-intensive network layers with large memory occupancy values and the network layers with small memory occupancy values provides an effective data basis for handling the network layers with large memory occupancy values.
It should be noted that the embodiment of the present application does not limit the specific value of the second threshold; a person skilled in the art can determine it according to the practical situation.
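Under those caveats, a minimal sketch of the boundary search in steps S510 and S520 might look as follows; the threshold and the memory values are illustrative, and counting the candidate layer itself on the "before" side is one possible reading:

```python
# Hypothetical boundary-layer search: walk the split points in network order
# and return the first layer where before/after memory exceeds the threshold.
def find_boundary(mem_per_layer, threshold=4.0):
    total = sum(mem_per_layer)
    before = 0
    for i, m in enumerate(mem_per_layer):
        before += m                      # candidate layer counted as "before"
        after = total - before
        if after > 0 and before / after > threshold:
            return i                     # index of the boundary network layer
    return len(mem_per_layer) - 1

print(find_boundary([120, 100, 90, 15, 10, 8, 5]))  # -> 2 with this threshold
```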
In addition, referring to fig. 6, in an embodiment, after performing step S520 in the embodiment shown in fig. 5, the method for adjusting the neural network model further includes, but is not limited to, the following steps:
step S610, acquiring data to be processed, inputting the data to be processed into a second neural network model, and obtaining a second output result, wherein the second output result comprises at least two optional output data;
step S620, acquiring optional input data corresponding to each optional output data;
step S630, obtaining target output data from at least two optional output data, wherein the network output layer corresponding to the target output data is a second network layer which is sequenced and positioned in front of the boundary network layer;
step S640, acquiring third target input data, where the third target input data is input data corresponding to the target output data;
step S650, carrying out data merging processing on the target output data and the third target input data to obtain a second intermediate data set;
step S660, classifying the data in the second intermediate data set according to a preset data classification rule to obtain first intermediate data and second intermediate data, wherein the task target of the first intermediate data is an IO task, and the task target of the second intermediate data is a calculation task;
step S670, dividing a storage device into a first storage area and a second storage area, wherein the storage device is used for operating a second neural network model;
step S680, sending the first intermediate data to a first storage area to execute input/output (IO) operation;
in step S690, the second intermediate data is sent to the second storage area to perform data operation.
It can be understood that, when the MCU runs the second neural network model, the memory of the MCU is divided into a first storage area and a second storage area. The target output data and the third target input data of the second network layers ordered before the boundary network layer are acquired and merged to obtain a second intermediate data set, and the data in the second intermediate data set are classified according to the preset data classification rule to obtain first intermediate data and second intermediate data. The first intermediate data are sent to the first storage area to perform input/output (IO) operations, that is, data exchange between the memory and the flash memory, including tasks that read inputs and weights from the external flash memory into the memory and tasks that write outputs from the memory to the external flash memory; the second intermediate data are sent to the second storage area to perform data operations, so that compute tasks and IO tasks run in parallel. In the scheme provided in this embodiment, the MCU may maintain two queues: an IO queue and a compute queue. The computation of the network layers ordered before the boundary network layer is divided into several read tasks, several write tasks and several compute tasks, with dependencies between them: a compute task depends on the read tasks that load its input data and weights, a write task depends on the output of its compute task, and the task of reading the weights may depend on the task of reading the input. When all the dependencies of a task are satisfied, the task enters a ready state; read and write tasks enter the IO queue, compute tasks enter the compute queue, the two queues are continuously checked in parallel, and when a task is detected to be ready it is executed and then popped from its queue. In the embodiment of the present application, on the basis of obtaining the second neural network model with a reduced memory occupancy value, running the compute tasks and IO tasks of the second neural network model in parallel on the MCU can further reduce the IO latency.
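The two-queue scheme can be sketched as follows. On a real MCU the IO queue would be drained by flash/DMA transfers while the CPU drains the compute queue; this single-threaded sketch only models the dependency bookkeeping, and every task name is hypothetical:

```python
# A hedged sketch of the IO queue / compute queue scheduling described above.
from collections import deque

class Task:
    def __init__(self, name, kind, deps=()):
        self.name, self.kind = name, kind
        self.deps, self.done = list(deps), False

def run(tasks):
    io_q = deque(t for t in tasks if t.kind in ("read", "write"))
    compute_q = deque(t for t in tasks if t.kind == "compute")
    while io_q or compute_q:
        progressed = False
        for q in (io_q, compute_q):      # the two queues are checked in turn
            ready = next((t for t in q if all(d.done for d in t.deps)), None)
            if ready is not None:        # task has entered the ready state
                q.remove(ready)
                ready.done = True        # stand-in for the real IO or compute work
                print("ran", ready.kind, ready.name)
                progressed = True
        if not progressed:
            raise RuntimeError("unsatisfiable dependency")

# Example dependencies: the compute task needs its input and weight reads, the
# write-back needs the compute, and the weight read follows the input read.
r_in = Task("layer1_input", "read")
r_w  = Task("layer1_weights", "read", deps=[r_in])
c    = Task("layer1_conv", "compute", deps=[r_in, r_w])
w    = Task("layer1_output", "write", deps=[c])
run([r_in, r_w, c, w])
```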
It should be noted that the embodiment of the present application does not limit the specific model of the MCU; it may be, for example, a small STM32-series MCU. According to the technical solution of the present application, by optimizing the scheduling order of the network layers in the neural network model, ordering the computation-intensive first layers of the network up to the boundary network layer, and exchanging data between the memory and the external flash memory, the peak memory can be reduced without incurring excessive latency overhead.
In addition, referring to fig. 7, in an embodiment, before step S610 in the embodiment shown in fig. 6, the method for adjusting the neural network model further includes, but is not limited to, the following steps:
step S710, acquiring a preset data cleaning rule;
and S720, performing data cleaning on the data to be processed according to the data cleaning rule to obtain new data to be processed.
It can be understood that performing data cleaning on the data to be processed before it is input into the second neural network model makes the output result of the second neural network model more usable, and the new data to be processed obtained after cleaning can provide an effective data basis for adjusting the second neural network model.
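A minimal sketch of steps S710 and S720, assuming the data cleaning rule can be expressed as an ordered list of record-level functions; the concrete rules shown (drop incomplete records, clip values to [0, 1]) are illustrative only:

```python
# Hypothetical cleaning rules: each rule maps a record to a cleaned record,
# or to None to discard it.
def clean(records, rules):
    for rule in rules:
        records = [out for out in (rule(r) for r in records) if out is not None]
    return records

drop_incomplete = lambda r: r if None not in r else None
clip_unit       = lambda r: [min(max(v, 0.0), 1.0) for v in r]

raw = [[0.2, 1.4], [0.5, None], [0.9, 0.1]]
print(clean(raw, [drop_incomplete, clip_unit]))  # -> [[0.2, 1.0], [0.9, 0.1]]
```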
In addition, referring to fig. 9, fig. 9 is a block diagram illustrating an adjusting apparatus of a neural network model according to another embodiment of the present invention, and an embodiment of the present invention further provides an adjusting apparatus 900 of a neural network model, where the adjusting apparatus 900 of a neural network model includes:
a first output result obtaining module 910, where the first output result obtaining module 910 is configured to obtain test data, and input the test data to a preset first neural network model to obtain a first output result;
a first adjustable variable obtaining module 920, where the first adjustable variable obtaining module 920 is configured to obtain a first adjustable variable from the first output result;
a first target input data obtaining module 930, where the first target input data obtaining module 930 is configured to obtain first target input data corresponding to a first adjustable variable;
a neural network model adjusting module 940, where the neural network model adjusting module 940 is configured to adjust at least two first network layers in the first neural network model according to the first adjustable variable, the first target input data, and a preset network layer adjusting rule, so as to obtain a second neural network model.
In addition, referring to fig. 10, fig. 10 is a structural diagram of an adjusting apparatus of a neural network model according to another embodiment of the present invention, and an embodiment of the present invention further provides an adjusting apparatus 1000 of a neural network model, where the adjusting apparatus 1000 of a neural network model includes: a memory 1010, a processor 1020, and computer programs stored on the memory 1010 and executable on the processor 1020.
The processor 1020 and the memory 1010 may be connected by a bus or other means.
The non-transitory software programs and instructions required to implement the neural network model adjustment method of the above embodiment are stored in the memory 1010; when executed by the processor 1020, they perform the neural network model adjustment method of the above embodiment, for example the method steps S110 to S140 in fig. 1, the method steps S210 to S230 in fig. 2, the method steps S310 to S370 in fig. 3, the method steps S410 to S430 in fig. 4, the method steps S510 to S520 in fig. 5, the method steps S610 to S690 in fig. 6 and the method steps S710 to S720 in fig. 7 described above.
The above-described embodiments of the apparatus are merely illustrative; the units described as separate components may or may not be physically separate, that is, they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, an embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor or controller, for example by the processor 1020 in the embodiment of the apparatus 1000 for adjusting a neural network model, cause the processor to perform the method for adjusting a neural network model in the above embodiment, for example to perform the method steps S110 to S140 in fig. 1, the method steps S210 to S230 in fig. 2, the method steps S310 to S370 in fig. 3, the method steps S410 to S430 in fig. 4, the method steps S510 to S520 in fig. 5, the method steps S610 to S690 in fig. 6 and the method steps S710 to S720 in fig. 7 described above.

One of ordinary skill in the art will appreciate that all or some of the steps, systems and methods disclosed above may be implemented as software, firmware, hardware and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor or microprocessor, as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).

As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media.
While the preferred embodiments of the present invention have been described in detail, it will be understood by those skilled in the art that the foregoing and various other changes, omissions and deviations in the form and detail thereof may be made without departing from the scope of this invention.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more programs for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based apparatus that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software, or may be implemented by hardware, and the described units may also be disposed in a processor. Wherein the names of the elements do not in some way constitute a limitation on the elements themselves.
It should be noted that although in the above detailed description several modules or units of the device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit according to embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which can be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiments of the present application.
The terminal of this embodiment may include: a Radio Frequency (RF) circuit, a memory, an input unit, a display unit, sensors, an audio circuit, a wireless fidelity (WiFi) module, a processor and a power supply.

The RF circuit can be used for receiving and transmitting signals while receiving and sending information or during a call; in particular, downlink information from the base station is received and then handed to the processor for processing, and uplink data is transmitted to the base station. Typically, the RF circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer and the like. In addition, the RF circuit may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS) and the like.

The memory may be used to store software programs and modules; the processor executes the various functional applications of the terminal and performs data processing by running the software programs and modules stored in the memory. The memory may mainly include a program storage area and a data storage area: the program storage area may store an operating system, the application programs required by at least one function (such as a sound playing function or an image playing function) and the like, while the data storage area may store data created according to the use of the terminal (such as audio data or a phonebook) and the like. Further, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device or another solid-state storage device.

The input unit may be used to receive input numeric or character information and to generate key signal inputs related to the settings and function control of the terminal. Specifically, the input unit may include a touch panel and other input devices. The touch panel, also called a touch screen, can collect touch operations on or near it (such as operations performed on or near the touch panel with any suitable object or accessory, such as a finger or a stylus) and drive the corresponding connection devices according to a preset program. Optionally, the touch panel may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch position and the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor, and can receive and execute commands sent by the processor. In addition, the touch panel may be implemented in various types, such as resistive, capacitive, infrared and surface acoustic wave. Besides the touch panel, the input unit may include other input devices, which may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys or a power key), a trackball, a mouse, a joystick and the like.
The display unit may be used to display information input by the user or information provided to the user, as well as the various menus of the terminal. The display unit may include a display panel, and optionally the display panel may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display or the like. Further, the touch panel may cover the display panel; when the touch panel detects a touch operation on or near it, it transmits the operation to the processor to determine the category of the touch event, and the processor then provides a corresponding visual output on the display panel according to that category. Although the touch panel and the display panel may be two separate components implementing the input and output functions of the terminal, in some embodiments the two may be integrated to implement those functions.

The terminal may also include at least one sensor, such as a light sensor, a motion sensor or another sensor. Specifically, the light sensors may include an ambient light sensor, which can adjust the brightness of the display panel according to the brightness of the ambient light, and a proximity sensor, which can turn off the display panel and/or the backlight when the terminal is moved to the ear. As one kind of motion sensor, an accelerometer can detect the magnitude of acceleration in each direction (generally three axes) and the magnitude and direction of gravity when stationary, and can be used in applications that recognize the terminal's posture (such as switching between landscape and portrait, related games and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer or tap detection); other sensors that can be configured in the terminal, such as a gyroscope, a barometer, a hygrometer, a thermometer and an infrared sensor, are not described in detail here.

The audio circuit, a speaker and a microphone may provide an audio interface. The audio circuit can transmit the electric signal converted from received audio data to the speaker, which converts it into a sound signal for output; conversely, the microphone converts a collected sound signal into an electric signal, which is received by the audio circuit and converted into audio data; the audio data are then output to the processor for processing and transmitted, for example, to another terminal via the RF circuit, or output to the memory for further processing.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. A method for adjusting a neural network model, comprising:
acquiring test data, and inputting the test data into a preset first neural network model to obtain a first output result;
obtaining a first adjustable variable from the first output result;
acquiring first target input data corresponding to the first adjustable variable;
and adjusting at least two first network layers in the first neural network model according to the first adjustable variable, the first target input data and a preset network layer adjustment rule to obtain a second neural network model.
2. The method of claim 1, wherein obtaining the first target input data corresponding to the first adjustable variable comprises:
acquiring a preset relational mapping table, wherein the relational mapping table represents the mapping relation between the first output result and a network output layer, and the network output layer is the network layer for generating the output result;
determining a network output layer corresponding to the first adjustable variable in the relational mapping table as a target network output layer;
and obtaining the first target input data according to the target network output layer, wherein the first target input data is data input to the target network output layer.
3. The method of claim 2, wherein the first adjustable variable comprises at least two first variables; adjusting at least two first network layers in the neural network model according to the first adjustable variable, the target input data and a preset network layer adjustment rule to obtain a second neural network model, including:
determining a first target variable from at least two first variables, and acquiring first input data corresponding to the first target variable from the first target input data;
determining a network output layer corresponding to the first target variable in the relational mapping table as a first network output layer, and acquiring a first network layer identifier, wherein the first network layer identifier is a parameter uniquely identifying the first network output layer;
determining the sum of the memory occupancy value of the first adjustable variable and the memory occupancy value of the first input data as a first memory occupancy value;
deleting the first target variable in the first adjustable variables, and performing data merging processing on the first adjustable variables after the first target variable is deleted and the first input data to obtain a first intermediate data set, wherein the first intermediate data set comprises second adjustable variables, and the second adjustable variables comprise at least two second variables;
determining a second target variable from at least two second variables, and acquiring second target input data corresponding to the second target variable;
determining the sum of the memory occupancy value of the second adjustable variable and the memory occupancy value of the second target input data as a second memory occupancy value;
and adjusting at least two first network layers in the neural network model according to the first memory occupation value, the second memory occupation value and the first network layer identification to obtain the second neural network.
4. The method of claim 3, wherein after the adjusting at least two first network layers in the first neural network model according to the first memory occupancy value, the second memory occupancy value and the first network layer identifier to obtain the second neural network model, the method further comprises:
when the number of the second variables remaining in the second adjustable variable is greater than a preset first threshold value, determining the second adjustable variable as a new first adjustable variable;
acquiring new first target input data corresponding to the new first adjustable variable;
and adjusting at least two second network layers in the second neural network model according to the new first adjustable variable, the new first target input data and the network layer adjustment rule to obtain a new second neural network model.
5. The method of claim 1, wherein the second neural network model comprises at least two second network layers connected in sequence, and after the adjusting at least two first network layers in the first neural network model according to the first adjustable variable, the first target input data and a preset network layer adjustment rule to obtain the second neural network model, the method further comprises:
acquiring a third memory occupancy value corresponding to each second network layer;
determining a boundary network layer from the at least two second network layers, wherein the ratio of the sum of the third memory occupancy values corresponding to the second network layers ordered before the boundary network layer to the sum of the third memory occupancy values corresponding to the second network layers ordered after the boundary network layer is greater than a preset second threshold value.
6. The method of claim 5, wherein after the determining a boundary network layer from the at least two second network layers, the method further comprises:
acquiring data to be processed, and inputting the data to be processed into the second neural network model to obtain a second output result, wherein the second output result comprises at least two candidate output data;
acquiring candidate input data corresponding to each candidate output data;
acquiring target output data from the at least two candidate output data, wherein the network output layer corresponding to the target output data is a second network layer ordered before the boundary network layer;
acquiring third target input data, wherein the third target input data is input data corresponding to the target output data;
performing data merging processing on the target output data and the third target input data to obtain a second intermediate data set;
classifying the data in the second intermediate data set according to a preset data classification rule to obtain first intermediate data and second intermediate data, wherein the task target of the first intermediate data is an IO task, and the task target of the second intermediate data is a calculation task;
dividing a storage device into a first storage area and a second storage area, wherein the storage device is used for running the second neural network model;
sending the first intermediate data to the first storage area to execute input/output (IO) operations;
and sending the second intermediate data to the second storage area to execute computing operations.
7. The method of claim 6, wherein before the inputting the data to be processed into the second neural network model, the method further comprises:
acquiring a preset data cleaning rule;
and carrying out data cleaning on the data to be processed according to the data cleaning rule to obtain new data to be processed.
8. An apparatus for adjusting a neural network model, comprising:
the first output result acquisition module is used for acquiring test data and inputting the test data into a preset first neural network model to obtain a first output result;
a first adjustable variable obtaining module, configured to obtain a first adjustable variable from the first output result;
a first target input data acquisition module, configured to acquire first target input data corresponding to the first adjustable variable;
and the neural network model adjusting module is used for adjusting at least two first network layers in the first neural network model according to the first adjustable variable, the first target input data and a preset network layer adjusting rule to obtain a second neural network model.
9. An apparatus for adjusting a neural network model, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for adjusting a neural network model according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium storing computer-executable instructions for performing the method for adjusting a neural network model according to any one of claims 1 to 7.
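
Illustrative sketches (editorial, not part of the claims)

The claims above describe the adjustment pipeline in prose only; the application defines no concrete API. The Python sketches below are non-authoritative illustrations of the claimed steps, and every identifier in them (adjust_model, adjustment_rule, producing_layer and so on) is a hypothetical stand-in. Claim 1's overall flow (run test data through the first model, extract the adjustable variables from the output, fetch the input data behind them, then rewrite layers under a preset rule) might look like this:

def adjust_model(first_model, test_data, adjustment_rule):
    """Minimal sketch of the claim 1 flow; all attribute names are assumptions."""
    # Step 1: input the test data into the preset first neural network model.
    first_output_result = first_model(test_data)

    # Step 2: obtain the first adjustable variables from the first output
    # result (assumed here to be the outputs flagged as tunable).
    first_adjustable = [o for o in first_output_result if o.adjustable]

    # Step 3: acquire the first target input data, i.e. the data fed into
    # the layer that produced each adjustable variable.
    first_target_input = {v.name: v.producing_layer.input_data
                          for v in first_adjustable}

    # Step 4: adjust at least two network layers per the preset rule,
    # yielding the second neural network model.
    return adjustment_rule(first_model, first_adjustable, first_target_input)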
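
Claim 2's relational mapping table can be modeled as a plain dictionary from output-variable name to the network output layer that produced it; the layer names and the captured-inputs dictionary below are invented for illustration:

def first_target_input_for(variable_name, relation_map, layer_inputs):
    """Sketch of claim 2: look up the target network output layer for an
    adjustable variable, then return the data that was fed into that layer."""
    target_layer = relation_map[variable_name]   # target network output layer
    return layer_inputs[target_layer]            # first target input data

# Hypothetical usage:
relation_map = {"out_a": "layer_7", "out_b": "layer_9"}
layer_inputs = {"layer_7": [0.1, 0.2], "layer_9": [0.3]}
print(first_target_input_for("out_a", relation_map, layer_inputs))  # [0.1, 0.2]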
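
Claims 3 and 4 reduce to a loop: pick a target variable, record the summed memory occupancy values, delete the target, merge its input data in, and repeat while more variables remain than the first threshold. The sketch below uses NumPy arrays as stand-ins for layer data; choosing the largest variable as the target, and keeping merged-in inputs out of the candidate pool so the loop terminates, are both assumptions the claims leave open:

import numpy as np

def footprint(arrays):
    # "Memory occupancy value": total bytes held by a group of arrays.
    return int(sum(a.nbytes for a in arrays))

def merge_round(candidates, merged, inputs_for):
    """One round per claim 3. `candidates` maps variable name -> array;
    `inputs_for(name)` is a hypothetical hook returning the input data
    that produced the named variable."""
    target = max(candidates, key=lambda n: candidates[n].nbytes)
    target_input = inputs_for(target)
    occupancy = footprint(candidates.values()) + target_input.nbytes
    del candidates[target]                       # delete the target variable
    merged[target + "_input"] = target_input     # data merging step
    return occupancy

def adjust_until_threshold(candidates, inputs_for, first_threshold=2):
    """Claim 4's repetition: keep merging while the number of remaining
    variables exceeds the preset first threshold."""
    merged, occupancies = {}, []
    while len(candidates) > first_threshold:
        occupancies.append(merge_round(candidates, merged, inputs_for))
    return {**candidates, **merged}, occupancies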
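
Claim 5's boundary network layer is the first layer at which the memory held by the preceding layers, relative to the following layers, exceeds a preset ratio. A sketch over a list of per-layer memory occupancy values (the default threshold of 1.0 is only an example):

def find_boundary_layer(layer_occupancies, second_threshold=1.0):
    """Return the index of the first layer whose preceding layers hold
    more than `second_threshold` times the memory of its following layers."""
    total = sum(layer_occupancies)
    before = 0
    for i, occupancy in enumerate(layer_occupancies):
        after = total - before - occupancy
        if after > 0 and before / after > second_threshold:
            return i
        before += occupancy
    return len(layer_occupancies) - 1   # degenerate case: fall back to the last layer

# e.g. find_boundary_layer([10, 40, 30, 10, 10]) returns 2: the 50 bytes
# before that layer outweigh the 20 bytes after it.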
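
Claim 6 then splits the merged intermediate data by task target and routes each class to its own region of the storage device that runs the model. In the sketch the two storage areas are modeled as plain lists and the data classification rule is a caller-supplied function returning "io" or "compute", both of which are assumptions:

def route_intermediate_data(second_intermediate_set, classify):
    """Sketch of claim 6's routing: IO-bound items go to the first storage
    area, compute-bound items to the second."""
    first_storage_area = []    # receives data destined for IO operations
    second_storage_area = []   # receives data destined for computation
    for item in second_intermediate_set:
        if classify(item) == "io":
            first_storage_area.append(item)
        else:
            second_storage_area.append(item)
    return first_storage_area, second_storage_area

# Hypothetical usage: classify by a tag attached during merging.
items = [{"tag": "io", "value": 1}, {"tag": "compute", "value": 2}]
io_area, compute_area = route_intermediate_data(items, lambda d: d["tag"])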

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210820863.8A CN115081624A (en) 2022-07-13 2022-07-13 Neural network model adjusting method and device and computer readable storage medium


Publications (1)

Publication Number Publication Date
CN115081624A 2022-09-20

Family

ID=83259610


Country Status (1)

Country Link
CN (1) CN115081624A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination