CN111563593B - Training method and device for neural network model


Info

Publication number
CN111563593B
Authority
CN
China
Prior art keywords
neural network
network model
target neural
trained
truncation
Prior art date
Legal status
Active
Application number
CN202010383383.0A
Other languages
Chinese (zh)
Other versions
CN111563593A
Inventor
希滕
张刚
温圣召
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010383383.0A
Publication of CN111563593A
Application granted
Publication of CN111563593B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to the field of artificial intelligence and discloses a training method and apparatus for a neural network model. The method includes performing the following search operation: determining a current truncation strategy from a preset truncation strategy search space according to a preset truncation strategy controller, where the truncation strategy includes the number of bits truncated in the binary representation of the neural network model's parameters or intermediate output data; performing iterative training on a target neural network model to be trained based on the current truncation strategy; acquiring the performance of the target neural network model trained based on the current truncation strategy and generating corresponding feedback information; and, in response to determining that the target neural network model trained based on the current truncation strategy has not reached a preset convergence condition, iteratively updating the truncation strategy controller based on the feedback information so as to perform the next search operation based on the updated truncation strategy controller. A neural network model trained by this method suffers only a small precision loss after quantization.

Description

Training method and device for neural network model
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, in particular to the field of artificial intelligence, and specifically to a training method and apparatus for a neural network model.
Background
Quantization of a neural network model converts model parameters of high bit width into model parameters of low bit width in order to increase the model's computation speed. Quantization is typically performed after training of the high-bit-width neural network model is complete, and the resulting low-bit-width model is used directly to execute the corresponding deep learning task. However, because quantization reduces the precision of the parameters, the accuracy loss of the quantized model may exceed the acceptable range.
Disclosure of Invention
Embodiments of the present disclosure provide a training method and apparatus for a neural network model, an electronic device, and a computer-readable storage medium.
According to a first aspect, there is provided a training method for a neural network model, including performing the following search operation: determining a current truncation strategy from a preset truncation strategy search space according to a preset truncation strategy controller, where the truncation strategy includes the number of bits truncated in the binary representation of the neural network model's parameters or intermediate output data; performing iterative training on the target neural network model to be trained based on the current truncation strategy, where in each iteration of the training process the prediction result and the loss function value of the target neural network model to be trained are generated after the binary representation of its parameters or intermediate output data is truncated according to the current truncation strategy, and the parameters of the target neural network model to be trained are updated by back-propagating the loss function value; acquiring the performance of the target neural network model trained based on the current truncation strategy and generating corresponding feedback information; and, in response to determining that the target neural network model trained based on the current truncation strategy has not reached a preset convergence condition, iteratively updating the truncation strategy controller based on the feedback information so as to perform the next search operation based on the updated truncation strategy controller.
According to a second aspect, there is provided a training apparatus for a neural network model, including a search unit configured to perform a search operation. The search unit includes: a determining unit configured to perform the following step in the search operation: determining a current truncation strategy from a preset truncation strategy search space according to a preset truncation strategy controller, where the truncation strategy includes the number of bits truncated in the binary representation of the neural network model's parameters or intermediate output data; a training unit configured to perform the following step in the search operation: performing iterative training on the target neural network model to be trained based on the current truncation strategy, where in each iteration of the training process the prediction result and the loss function value of the target neural network model to be trained are generated after the binary representation of its parameters or intermediate output data is truncated according to the current truncation strategy, and the parameters of the target neural network model to be trained are updated by back-propagating the loss function value; an acquisition unit configured to perform the following step in the search operation: acquiring the performance of the target neural network model trained based on the current truncation strategy and generating corresponding feedback information; and an updating unit configured to perform the following step in the search operation: in response to determining that the target neural network model trained based on the current truncation strategy has not reached a preset convergence condition, iteratively updating the truncation strategy controller based on the feedback information so as to perform the next search operation based on the updated truncation strategy controller.
According to a third aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of training the neural network model provided in the first aspect.
According to a fourth aspect, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the training method of the neural network model provided in the first aspect.
According to the technology of the present application, by searching for an optimal truncation strategy during the training of the neural network model, the trained neural network model is made insensitive to quantization, so that its accuracy loss after quantization is small.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings:
FIG. 1 is a flow chart of one embodiment of a training method of a neural network model of the present disclosure;
FIG. 2 is a flow chart of another embodiment of a training method of the neural network model of the present disclosure;
FIG. 3 is a flow chart of yet another embodiment of a training method of the neural network model of the present disclosure;
FIG. 4 is a schematic diagram of the architecture of one embodiment of a training apparatus of the neural network model of the present disclosure;
FIG. 5 is a block diagram of an electronic device used to implement the training method of a neural network model according to an embodiment of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, only the portions related to the invention are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The method or apparatus of the present disclosure may be applied to a terminal device or a server, or may be applied to a system architecture including a terminal device, a network, and a server. The medium used by the network to provide a communication link between the terminal device and the server may include various connection types, such as a wired, wireless communication link, or fiber optic cable, among others.
The terminal device may be a user-end device on which various client applications may be installed, such as image processing applications, search applications, and voice service applications. The terminal device may be hardware or software. When it is hardware, it may be any of various electronic devices, including but not limited to smartphones, tablets, e-book readers, laptop computers, and desktop computers. When it is software, it may be installed in the electronic devices listed above and implemented as multiple pieces of software or software modules, or as a single piece of software or software module; no specific limitation is imposed here.
The server may be a server running various services, for example services such as object detection and recognition, text or speech recognition, and signal conversion applied to image, video, voice, text, or digital-signal data. The server may obtain deep learning task data to construct training samples and train a neural network model for executing a deep learning task.
The server may be a back-end server providing back-end support for applications installed on the terminal device. For example, the server may receive data to be processed sent by the terminal device, process the data using the neural network model, and return the processing result to the terminal device.
The server may be hardware or software. When the server is hardware, the server may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules (e.g., a plurality of software or software modules for providing distributed services), or as a single software or software module. The present invention is not particularly limited herein.
It should be noted that, the training method of the neural network model provided by the embodiment of the present disclosure may be executed by a terminal device or a server, and accordingly, the training apparatus of the neural network model may be disposed in the terminal device or the server.
Referring to fig. 1, a flow 100 of one embodiment of a training method for a neural network model according to the present disclosure is shown. The training method includes performing a search operation, which specifically includes the following steps 101 to 104:
Step 101: determining a current truncation strategy from the search space of the preset truncation strategy according to the preset truncation strategy controller.
The truncation strategy controller is configured to generate a truncation strategy. The truncation strategy includes the number of bits truncated in the binary representation of the neural network model's parameters or intermediate output data. The truncated bits are the last several bits of the binary representation, and truncation sets them to 0. For example, in the binary representation 11001011 of an 8-bit integer parameter, if the number of truncated bits is 3, the last 3 bits are set to 0 and the truncated binary representation is 11001000. Truncation reduces the precision of the parameters or intermediate output data: with 3 truncated bits, the 8-bit integers 11001011 and 11001110 are both truncated to 11001000.
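For illustration only (this sketch is not part of the patent text), such truncation can be expressed as a bitmask over an 8-bit integer:

```python
# Illustrative sketch only (not part of the patent text): zero out the
# lowest `num_truncated_bits` bits of an 8-bit integer value.
def truncate_bits(value: int, num_truncated_bits: int) -> int:
    mask = (0xFF << num_truncated_bits) & 0xFF  # e.g. 0b11111000 for 3 bits
    return value & mask

assert truncate_bits(0b11001011, 3) == 0b11001000
assert truncate_bits(0b11001110, 3) == 0b11001000  # both collapse to one value
```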
The truncation strategy controller is used to control the truncation bit count for the parameters or intermediate output data of designated layers of the neural network model. Here, the intermediate output data may be data output by an intermediate layer, such as the feature map or vector data output by an intermediate convolutional layer, fully connected layer, pooling layer, or the like.
The truncation strategy controller may be implemented as a neural network model such as a recurrent neural network or a convolutional neural network, as a mathematical model such as a probability model, or as a reinforcement learning algorithm, an evolutionary algorithm, a simulated annealing algorithm, or the like. As the search operation is executed iteratively, the controller is updated automatically based on the evaluation results of the truncation strategies it has searched out, so that the current truncation strategy it produces is updated as well.
The truncation strategy controller may generate a truncation strategy sequence and decode it according to predefined encoding and decoding rules of the truncation strategy, obtaining the truncation strategy for the parameters or intermediate output data of the corresponding layers of the neural network model.
The preset truncation strategy search space may include the selectable truncation bit counts for the parameters or intermediate output data of several layers of the neural network model. In each search operation, the current truncation strategy controller is used to search the current truncation strategy out of this search space.
The precision loss after truncation differs with the number of truncated bits. In general, the more bits are truncated, the greater the precision loss after truncation; but the fewer bits are truncated, the more sensitive the trained neural network model is to quantization. The method of this embodiment can search out the optimal truncation strategy for the target neural network model through multiple search operations.
Step 102: performing iterative training on the target neural network model to be trained based on the current truncation strategy.
The target neural network model may be trained through multiple iterations. In each iteration of the training process, the binary representation of the parameters or intermediate output data of the target neural network model to be trained is truncated according to the current truncation strategy before the prediction result and the loss function value are generated, and the parameters of the target neural network model to be trained are updated by back-propagating the loss function value.
Specifically, in each training iteration, sample data is input into the target neural network model, the binary representation of the model's parameters or intermediate output data is truncated according to the current truncation strategy, the truncated parameters or intermediate output data are used to obtain the model's prediction result on the sample data and to calculate a loss function value characterizing the prediction error, and the parameters of the target neural network model to be trained are iteratively updated by gradient descent according to the loss function value. Training stops when the parameters of the target neural network model to be trained converge or the loss function value converges, yielding the trained target neural network model.
Alternatively, the truncation strategy may include the number of bits truncated in the binary representation of the feature map output by a feature extraction layer of the neural network model. Here, the binary representation of the feature map is the binary representation of its pixel values. In this case, the prediction result and the loss function value of the target neural network model to be trained may be generated as follows: sample image data is input into the target neural network model to be trained for feature extraction, the corresponding number of bits is truncated from the binary representation of the feature map output by at least one feature extraction layer according to the current truncation strategy, and the prediction result and the loss function value are generated based on the truncated binary representation of the feature map.
After the current truncation strategy is determined, that is, after the truncation bit count corresponding to at least one feature extraction layer of the target neural network model is determined in the current search operation, the binary representation of the feature map that the sample image data produces at each corresponding feature extraction layer can be truncated by the corresponding number of bits. Based on the truncated feature maps, the target neural network model outputs its prediction result on the sample image data as well as the value of the loss function used to supervise its training.
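As an illustrative sketch of this step, a floating-point feature map can first be mapped to an assumed 8-bit fixed-point representation and then truncated; the bit width and scale factor here are assumptions of the sketch, not values fixed by the patent:

```python
import numpy as np

# Illustrative sketch only: the 8-bit fixed-point representation and the
# scale factor are assumptions of this example, not prescribed by the patent.
def truncate_feature_map(fmap: np.ndarray, num_truncated_bits: int,
                         scale: float = 255.0) -> np.ndarray:
    # Map float activations to 8-bit fixed point, zero the low bits,
    # then map back to float so later layers can consume the result.
    fixed = np.clip(np.round(fmap * scale), 0, 255).astype(np.uint8)
    mask = np.uint8((0xFF << num_truncated_bits) & 0xFF)
    return (fixed & mask).astype(np.float32) / scale
```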
Step 103: acquiring the performance of the target neural network model trained based on the current truncation strategy and generating corresponding feedback information.
The performance of the target neural network model trained based on the current truncation strategy may be tested using test data. Here, the performance may include accuracy, run latency in a specified operating environment, recall, memory occupancy, and so on. Which performance metrics to test may be determined according to practical requirements. For example, in a user-interaction scenario with high real-time requirements, the latency of the target neural network model running on specified hardware may be tested; in a scenario with high accuracy requirements, such as face-based user identity authentication, the accuracy of the target neural network model may be tested.
Corresponding feedback information may be generated according to the performance of the trained target neural network model. The feedback information may be represented by a feedback value, whose initial value may be set to 0. The feedback value may be updated after the performance of the target neural network model trained based on the current truncation strategy is obtained in each search operation, and is fed back to the preset truncation strategy controller as an evaluation index of the current truncation strategy.
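As a toy illustration of such a feedback value, accuracy and latency measurements could be combined as follows; the target latency and the weighting are assumptions, not part of the patent:

```python
# Hypothetical feedback value combining accuracy with a latency penalty;
# the target latency and the weighting are assumptions for illustration.
def feedback_value(accuracy: float, latency_ms: float,
                   target_ms: float = 30.0) -> float:
    penalty = max(0.0, latency_ms - target_ms) / target_ms
    return accuracy - penalty
```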
Step 104: in response to determining that the target neural network model trained based on the current truncation strategy does not reach the preset convergence condition, iteratively updating the truncation strategy controller based on the feedback information, so as to execute the next search operation based on the updated truncation strategy controller.
If the target neural network model trained in the current search operation does not reach the preset convergence condition, the truncation strategy controller is iteratively updated based on the feedback information.
The preset convergence condition may include at least one of the following: the number of search operations reaches a preset threshold; the performance of the trained target neural network model reaches a preset performance threshold; the change in performance of the trained target neural network model over several consecutive search operations does not exceed a preset threshold; and so on.
The truncation strategy controller may be updated under the influence of the feedback value. When the controller is implemented as a recurrent neural network or a convolutional neural network, the parameters of that network may be updated based on the feedback value. When the controller is implemented as an evolutionary algorithm, the feedback value may serve as the fitness of the truncation strategy population, driving its evolution. When the controller is implemented as a reinforcement learning algorithm, the feedback value serves as the reward of the reinforcement learning model, which updates its parameters based on that reward.
In the next search operation, the updated truncation strategy controller may generate a new current truncation strategy. By executing the search operation many times, the optimal truncation strategy can be searched out. Because the parameters or intermediate output data of the target neural network model are truncated under the optimal truncation strategy, the model's sensitivity to quantization is reduced, and the quantization loss of the target neural network model trained under the optimal truncation strategy is reduced accordingly.
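Putting steps 101 to 104 together, the search operation can be sketched as the loop below. The random-search controller is only a stand-in for illustration; as noted above, the patent allows recurrent networks, evolutionary algorithms, reinforcement learning, and other controllers:

```python
import random

# Toy stand-in for the truncation strategy controller; the patent allows
# recurrent networks, evolutionary algorithms, reinforcement learning, etc.
class RandomController:
    def __init__(self, search_space):
        self.search_space = search_space  # {layer name: candidate bit counts}
        self.best_policy, self.best_reward = None, float("-inf")

    def sample(self):
        # Step 101: determine a current truncation strategy.
        return {layer: random.choice(bits)
                for layer, bits in self.search_space.items()}

    def update(self, policy, reward):
        # Step 104: use the feedback value to bias future search operations.
        if reward > self.best_reward:
            self.best_policy, self.best_reward = policy, reward

def run_search(train_and_eval, search_space, max_rounds=20):
    # train_and_eval(policy) trains the target model under `policy`
    # (step 102) and returns a feedback value (step 103).
    controller = RandomController(search_space)
    for _ in range(max_rounds):  # stand-in for the preset convergence condition
        policy = controller.sample()
        reward = train_and_eval(policy)
        controller.update(policy, reward)
    return controller.best_policy
```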
Referring to fig. 2, a flow chart of another embodiment of a training method of the neural network model of the present disclosure is shown. The flow 200 of the training method of the neural network model of the present embodiment includes performing a plurality of search operations, wherein the search operations include the following steps 201 to 204:
step 201, determining a current truncation strategy from a search space of a preset truncation strategy according to a preset truncation strategy controller.
The truncation strategy controller may be implemented as a neural network model such as a recurrent neural network or a convolutional neural network, as a mathematical model such as a probability model, or as a reinforcement learning algorithm, an evolutionary algorithm, a simulated annealing algorithm, or the like, and may be updated automatically according to the evaluation results of the truncation strategies it has searched out.
In this embodiment, the truncation strategy includes the number of bits truncated in the binary representation of the feature maps output by intermediate layers of the neural network model. At least one intermediate layer of the target neural network model can be designated in advance for the truncation operation, and in each search operation the truncation bit count corresponding to each designated intermediate layer is searched out from the truncation strategy search space and used as the current truncation strategy.
Optionally, the flow of the training method of this embodiment may further include a step of constructing the preset truncation strategy search space. The preset truncation strategy search space includes the candidate truncation bit counts corresponding to the feature map output by at least one intermediate layer of the target neural network model to be trained. The candidate truncation bit counts for each intermediate layer's feature map may be set in advance, for example to every integer in the interval [1, 32]. In each search operation, the truncation strategy controller may search, within that interval, the truncation bit count for each corresponding feature map, and combine the truncation bit counts searched for the feature maps output by the different intermediate layers to form the current truncation strategy for the whole target neural network model.
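A hypothetical search space of this kind, for a model with three designated intermediate layers (the layer names are illustrative), might be written as:

```python
# Hypothetical search space: every integer in [1, 32] is a candidate
# truncation bit count for each designated intermediate layer.
search_space = {
    "conv1": list(range(1, 33)),
    "conv2": list(range(1, 33)),
    "fc1": list(range(1, 33)),
}
```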
Step 202: performing iterative training on the target neural network model to be trained based on the current truncation strategy.
Step 202 includes performing multiple iterative operations, where each iterative operation includes the following step 2021:
step 2021, inputting the sample image data into the target neural network model to be trained to perform feature extraction, according to the current truncation strategy, truncating the corresponding bit number by the binary representation of the feature map output by at least one middle layer of the target neural network model to be trained, generating a prediction result and a loss function value of the target neural network model to be trained based on the binary representation of the feature map after truncating, and updating the parameters of the target neural network model to be trained by forward propagating the loss function value.
Specifically, in step 2021, after the sample image data is input into the target neural network model to be trained, the binary representation of the feature map output by each corresponding intermediate layer is truncated according to the current truncation strategy, the original feature map is replaced by the truncated one, the model then produces its final prediction result, and the loss function value is calculated from the error of that prediction.
Optionally, the iterative operation further includes: in response to determining that the number of iterative operations on the target neural network model to be trained has not reached a preset threshold and that the corresponding loss function value has not converged into a preset range, updating the parameters of the model based on the loss function value and executing the next iterative operation; and in response to determining that the number of iterative operations has reached the preset threshold, or that the loss function value has converged into the preset range, stopping the iterative operations to obtain the target neural network model trained based on the current truncation strategy.
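A sketch of this inner loop with its two stopping conditions follows; `model.train_step` is a hypothetical method assumed to perform one truncated forward pass and one back-propagation update, returning the loss:

```python
# Sketch of the inner iterative training with the two stopping conditions
# above; `model.train_step` is hypothetical and assumed to perform one
# truncated forward pass plus one back-propagation update, returning the loss.
def train_with_truncation(model, policy, batches, max_iters=10000, eps=1e-4):
    prev_loss = float("inf")
    for step, (x, y) in enumerate(batches):
        loss = model.train_step(x, y, policy)
        if step + 1 >= max_iters or abs(prev_loss - loss) < eps:
            break  # iteration cap reached, or loss converged to a preset range
        prev_loss = loss
    return model
```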
Step 203: acquiring the performance of the target neural network model trained based on the current truncation strategy and generating corresponding feedback information.
Step 204: in response to determining that the target neural network model trained based on the current truncation strategy does not reach the preset convergence condition, iteratively updating the truncation strategy controller based on the feedback information, so as to execute the next search operation based on the updated truncation strategy controller.
Step 203 and step 204 in this embodiment correspond to step 103 and step 104 in the foregoing embodiment, and specific implementation manners of step 203 and step 204 may refer to descriptions of step 103 and step 104 in the foregoing embodiment, respectively, and are not repeated herein.
Because the feature maps of a neural network model directly influence its final prediction result and loss function value, truncating them does not directly affect the precision of the model's parameters. In this embodiment, the loss function value is calculated from truncated feature maps, making it insensitive to the precision of the intermediate output data; a neural network model updated based on this loss function is therefore insensitive to the precision of its feature maps. In this way, the model's sensitivity to quantization is reduced while the precision of its parameters is preserved.
With continued reference to fig. 3, a flow chart of yet another embodiment of a training method of the neural network model of the present disclosure is shown. As shown in fig. 3, the process 300 of the training method of the neural network model of the present embodiment includes performing a search operation. The search operation includes the following steps 301 to 305:
step 301, determining a current truncation strategy from a search space of a preset truncation strategy according to a preset truncation strategy controller, wherein the truncation strategy comprises parameters of a neural network model or truncated digits in binary representation of intermediate output data.
Step 302: performing iterative training on the target neural network model to be trained based on the current truncation strategy, where, in each iteration of the training process, the prediction result and the loss function value of the target neural network model to be trained are generated after the binary representation of its parameters or intermediate output data is truncated according to the current truncation strategy, and the parameters of the target neural network model to be trained are updated by back-propagating the loss function value.
Step 303: acquiring the performance of the target neural network model trained based on the current truncation strategy and generating corresponding feedback information.
Step 304: in response to determining that the target neural network model trained based on the current truncation strategy does not reach the preset convergence condition, iteratively updating the truncation strategy controller based on the feedback information, so as to execute the next search operation based on the updated truncation strategy controller.
Steps 301 to 304 above correspond respectively to steps 101 to 104 of the foregoing embodiment, or may correspond to steps 201 to 204. For their specific implementation, refer to the descriptions in the foregoing embodiments; they are not repeated here.
Step 305: in response to determining that the target neural network model trained based on the current truncation strategy reaches the preset convergence condition, quantizing the target neural network model trained based on the current truncation strategy to obtain the quantized target neural network model.
When the target neural network model trained based on the current truncation strategy reaches the preset convergence condition, execution of the search operation stops, and the trained target neural network model can be regarded as having been trained based on the optimal truncation strategy. Its parameters can then be quantized to obtain the quantized target neural network model.
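For illustration, a generic symmetric 8-bit post-training quantization of the weights might look as follows; the patent does not prescribe a particular quantization scheme:

```python
import numpy as np

# Generic symmetric post-training quantization sketch; the patent does not
# prescribe a particular quantization scheme.
def quantize_weights(w: np.ndarray, num_bits: int = 8):
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale  # dequantize later as q * scale
```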
Because the target neural network model trained based on the optimal truncation strategy is insensitive to the values of the truncated bits in the binary representation of its parameters or intermediate output data, its sensitivity to the precision loss caused by parameter quantization is reduced, and the quantized model can achieve higher precision.
Optionally, the flow 300 of the training method of the neural network model may further include: sending the quantized target neural network model to the terminal side, so that the quantized target neural network model is deployed at the terminal side and used to process the corresponding task data.
The terminal side usually imposes high real-time requirements on the neural network model, and the quantized model increases the model's data processing speed. Since the quantized target neural network model maintains high precision, more accurate processing results can be obtained efficiently at the terminal side.
Referring to fig. 4, as an implementation of the training method of the neural network model, the present disclosure provides an embodiment of a training apparatus for a neural network model, where the embodiment of the apparatus corresponds to the embodiments of the method described above, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 4, the training apparatus 400 of the neural network model of this embodiment includes a search unit 401 configured to perform a search operation. The search unit 401 includes: a determining unit 4011 configured to perform the following step in the search operation: determining a current truncation strategy from a preset truncation strategy search space according to a preset truncation strategy controller, where the truncation strategy includes the number of bits truncated in the binary representation of the neural network model's parameters or intermediate output data; a training unit 4012 configured to perform the following step in the search operation: performing iterative training on the target neural network model to be trained based on the current truncation strategy, where in each iteration of the training process the prediction result and the loss function value of the target neural network model to be trained are generated after the binary representation of its parameters or intermediate output data is truncated according to the current truncation strategy, and the parameters of the target neural network model to be trained are updated by back-propagating the loss function value; an acquisition unit 4013 configured to perform the following step in the search operation: acquiring the performance of the target neural network model trained based on the current truncation strategy and generating corresponding feedback information; and an updating unit 4014 configured to perform the following step in the search operation: in response to determining that the target neural network model trained based on the current truncation strategy does not reach the preset convergence condition, iteratively updating the truncation strategy controller based on the feedback information, so as to perform the next search operation based on the updated truncation strategy controller.
In some embodiments, the truncation strategy includes the number of bits truncated in the binary representation of a feature map output by an intermediate layer of the neural network model; and the training unit 4012 is configured to generate the prediction result and the loss function value of the target neural network model to be trained as follows: inputting sample image data into the target neural network model to be trained for feature extraction, truncating the corresponding number of bits from the binary representation of the feature map output by at least one intermediate layer of the target neural network model to be trained according to the current truncation strategy, and generating the prediction result and the loss function value of the target neural network model to be trained based on the truncated binary representation of the feature map.
In some embodiments, the apparatus further includes: a construction unit configured to construct the preset truncation strategy search space, which includes the candidate truncation bit counts corresponding to the feature map output by at least one intermediate layer of the target neural network model to be trained.
In some embodiments, the search unit 401 further includes: a quantization unit configured to perform the following steps in the search operation: and in response to determining that the target neural network model trained based on the current truncation strategy reaches a preset convergence condition, quantizing the target neural network model trained based on the current truncation strategy to obtain a quantized target neural network model.
In some embodiments, the apparatus further includes: a sending unit configured to send the quantized target neural network model to the terminal side, so as to deploy the quantized target neural network model at the terminal side and use it to process the corresponding task data.
The above-described apparatus 400 corresponds to the steps in the method embodiments described above. Thus, the operations, features and technical effects that can be achieved by the training method for the neural network model described above are equally applicable to the apparatus 400 and the units contained therein, and are not described herein.
According to an embodiment of the present application, the present application also provides an electronic device and a readable storage medium.
As shown in fig. 5, a block diagram of an electronic device for the training method of a neural network model according to an embodiment of the present application is provided. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be exemplary only and are not meant to limit implementations of the application described and/or claimed here.
As shown in fig. 5, the electronic device includes one or more processors 501, a memory 502, and interfaces for connecting components, including high-speed and low-speed interfaces. The components are interconnected by different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to an interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Likewise, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 501 is illustrated in fig. 5.
Memory 502 is a non-transitory computer readable storage medium provided by the present application. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the neural network model training method provided by the present application. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to execute the training method of the neural network model provided by the present application.
The memory 502 is used as a non-transitory computer readable storage medium for storing a non-transitory software program, a non-transitory computer executable program, and modules, such as program instructions/units/modules (e.g., the search unit 401 shown in fig. 4) corresponding to a training method of a neural network model in an embodiment of the present application. The processor 501 executes various functional applications of the server and data processing, i.e., implements the training method of the neural network model in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory 502.
Memory 502 may include a storage program area that may store an operating system, at least one application program required for functionality, and a storage data area; the storage data area may store data created from the use of the electronic device for generating the structure of the neural network, and the like. In addition, memory 502 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 502 may optionally include memory located remotely from processor 501, which may be connected via a network to an electronic device used to generate the architecture of the neural network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the training method of the neural network model may further include: an input device 503 and an output device 504. The processor 501, memory 502, input devices 503 and output devices 504 may be connected by a bus 505 or otherwise, in fig. 5 by way of example by bus 505.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device used to generate the neural network structure, such as a touch screen, keypad, mouse, track pad, touch pad, pointing stick, one or more mouse buttons, trackball, joystick, and the like. The output device 504 may include a display device, auxiliary lighting (e.g., LEDs), a haptic feedback device (e.g., a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or a middleware component (e.g., an application server), or a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
The foregoing description covers only the preferred embodiments of the present disclosure and the principles of the technology employed. Those skilled in the art will appreciate that the scope of the application referred to in this disclosure is not limited to the specific combination of features described above, but also encompasses other embodiments in which the features described above or their equivalents are combined in any way without departing from the spirit of the application, for example, embodiments in which the features described above are replaced with technical features of similar function disclosed in (but not limited to) this application.

Claims (12)

1. A method of training a neural network model, comprising performing a search operation comprising:
determining a current truncation strategy from a preset truncation strategy search space according to a preset truncation strategy controller, wherein the truncation strategy comprises the number of bits truncated in the binary representation of parameters or intermediate output data of a neural network model, the truncated bits being set to 0;
performing iterative training on a target neural network model to be trained based on the current truncation strategy, wherein, in each iteration of the training process, a prediction result and a loss function value of the target neural network model to be trained are generated after the binary representation of the parameters or intermediate output data of the target neural network model to be trained is truncated according to the current truncation strategy, and the parameters of the target neural network model to be trained are updated by back-propagating the loss function value, wherein the input of the target neural network model is image data;
acquiring the performance of the target neural network model trained based on the current truncation strategy and generating corresponding feedback information;
and in response to determining that the target neural network model which is trained based on the current truncation strategy does not reach a preset convergence condition, iteratively updating the truncation strategy controller based on the feedback information so as to execute the next search operation based on the updated truncation strategy controller.
2. The method of claim 1, wherein the truncation strategy comprises the number of bits truncated in the binary representation of a feature map output by an intermediate layer of the neural network model; and
the step of generating a prediction result and a loss function value of the target neural network model to be trained after cutting off the parameters of the target neural network model to be trained or the binary representation of the intermediate output data according to the current cutting-off strategy comprises the following steps:
inputting sample image data into the target neural network model to be trained for feature extraction, truncating the corresponding number of bits from the binary representation of the feature map output by at least one intermediate layer of the target neural network model to be trained according to the current truncation strategy, and generating the prediction result and the loss function value of the target neural network model to be trained based on the truncated binary representation of the feature map.
3. The method of claim 2, wherein prior to performing the search operation, the method further comprises:
constructing the search space of the preset truncation strategy, wherein the search space of the preset truncation strategy comprises candidate truncation bit counts corresponding to the feature map output by at least one intermediate layer of the target neural network model to be trained.
4. A method according to any of claims 1-3, wherein the search operation further comprises:
and in response to determining that the target neural network model trained based on the current truncation strategy reaches a preset convergence condition, quantizing the target neural network model trained based on the current truncation strategy to obtain a quantized target neural network model.
5. The method of claim 4, wherein the method further comprises:
and sending the quantized target neural network model to a terminal side so as to deploy the quantized target neural network model at the terminal side and process corresponding task data by utilizing the quantized target neural network model.
6. A training apparatus of a neural network model includes a search unit configured to perform a search operation;
the search unit includes:
a determining unit configured to perform the following steps in the search operation: determining a current truncation strategy from a preset truncation strategy search space according to a preset truncation strategy controller, wherein the truncation strategy comprises the number of bits truncated in the binary representation of parameters or intermediate output data of a neural network model, the truncated bits being set to 0;
a training unit configured to perform the following steps in the search operation: performing iterative training on a target neural network model to be trained based on the current truncation strategy, wherein, in each iteration of the training process, a prediction result and a loss function value of the target neural network model to be trained are generated after the binary representation of the parameters or intermediate output data of the target neural network model to be trained is truncated according to the current truncation strategy, and the parameters of the target neural network model to be trained are updated by back-propagating the loss function value, wherein the input of the target neural network model is image data;
an acquisition unit configured to perform the following steps in the search operation: acquiring the performance of the target neural network model trained based on the current truncation strategy and generating corresponding feedback information;
an updating unit configured to perform the following steps in the search operation: and in response to determining that the target neural network model which is trained based on the current truncation strategy does not reach a preset convergence condition, iteratively updating the truncation strategy controller based on the feedback information so as to execute the next search operation based on the updated truncation strategy controller.
7. The apparatus of claim 6, wherein the truncation strategy comprises the number of bits truncated in the binary representation of a feature map output by an intermediate layer of the neural network model; and
the training unit is configured to generate a prediction result and a loss function value of a target neural network model to be trained as follows:
inputting sample image data into the target neural network model to be trained for feature extraction, truncating the corresponding number of bits from the binary representation of the feature map output by at least one intermediate layer of the target neural network model to be trained according to the current truncation strategy, and generating the prediction result and the loss function value of the target neural network model to be trained based on the truncated binary representation of the feature map.
8. The apparatus of claim 7, wherein the apparatus further comprises:
a construction unit configured to construct the search space of preset truncation strategies, wherein the search space comprises candidate truncation bit numbers corresponding to the feature map output by at least one intermediate layer of the target neural network model to be trained.
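(Illustrative sketch, not claim language: one possible shape for such a search space, assuming one list of candidate truncation bit counts per convolutional layer; the layer filter and the candidate set are assumptions.)

    import torch

    def build_search_space(model: torch.nn.Module, candidates=(0, 1, 2, 3, 4)):
        # One entry of candidate truncation bit counts per intermediate conv layer.
        return {name: list(candidates)
                for name, m in model.named_modules()
                if isinstance(m, torch.nn.Conv2d)}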
9. The apparatus of any one of claims 6-8, wherein the search unit further comprises:
a quantization unit configured to perform the following step in the search operation: in response to determining that the target neural network model trained based on the current truncation strategy reaches a preset convergence condition, quantizing the trained target neural network model to obtain a quantized target neural network model.
10. The apparatus of claim 9, wherein the apparatus further comprises:
a sending unit configured to send the quantized target neural network model to a terminal side, so that the quantized target neural network model is deployed at the terminal side and used to process corresponding task data.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-5.
CN202010383383.0A 2020-05-08 2020-05-08 Training method and device for neural network model Active CN111563593B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010383383.0A CN111563593B (en) 2020-05-08 2020-05-08 Training method and device for neural network model

Publications (2)

Publication Number Publication Date
CN111563593A (en) 2020-08-21
CN111563593B (en) 2023-09-15

Family

ID=72071843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010383383.0A Active CN111563593B (en) 2020-05-08 2020-05-08 Training method and device for neural network model

Country Status (1)

Country Link
CN (1) CN111563593B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112231511A (en) * 2020-10-20 2021-01-15 Tencent Music Entertainment Technology (Shenzhen) Co., Ltd. Neural network model training method and song mining method and device
CN112149266A (en) * 2020-10-23 2020-12-29 Beijing Baidu Netcom Science and Technology Co., Ltd. Method, device, equipment and storage medium for determining network model quantization strategy
CN113128680B (en) * 2021-03-12 2022-06-10 Shandong Yingxin Computer Technology Co., Ltd. Neural network training method, system, device and medium
CN113159410B (en) * 2021-04-14 2024-02-27 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method of automatic control model and fluid supply system control method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7313550B2 (en) * 2002-03-27 2007-12-25 Council Of Scientific & Industrial Research Performance of artificial neural network models in the presence of instrumental noise and measurement errors
US20170061279A1 (en) * 2015-01-14 2017-03-02 Intel Corporation Updating an artificial neural network using flexible fixed point representation
US10129361B2 (en) * 2015-07-01 2018-11-13 Oracle International Corporation System and method for multi-version remote function execution control in a distributed computing environment

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6990246B1 (en) * 1999-08-21 2006-01-24 Vics Limited Image coding
US9536190B2 (en) * 2013-10-17 2017-01-03 Qualcomm Incorporated Dynamically assigning and examining synaptic delay
CN107646116A (en) * 2015-05-08 2018-01-30 Qualcomm Incorporated Bit width selection for fixed-point neural networks
CN107665364A (en) * 2016-07-28 2018-02-06 Samsung Electronics Co., Ltd. Neural network method and apparatus
CN108460760A (en) * 2018-03-06 2018-08-28 Shaanxi Normal University Bridge crack image discrimination and restoration method based on a generative adversarial network
CN109146060A (en) * 2018-08-09 2019-01-04 Zhengzhou Yunhai Information Technology Co., Ltd. Method and device for processing data based on a convolutional neural network
CN109800865A (en) * 2019-01-24 2019-05-24 Beijing SenseTime Technology Development Co., Ltd. Neural network generation and image processing method and device, platform, and electronic device
CN110210611A (en) * 2019-05-13 2019-09-06 Xi'an Jiaotong University Dynamic adaptive data truncation method for convolutional neural network computation
CN110245753A (en) * 2019-05-27 2019-09-17 Southeast University Neural network compression method based on power-exponent quantization
CN110210454A (en) * 2019-06-17 2019-09-06 Hefei University of Technology Human action prediction method based on data fusion
CN110717585A (en) * 2019-09-30 2020-01-21 Shanghai Cambricon Information Technology Co., Ltd. Training method of neural network model, data processing method, and related product
CN110852421A (en) * 2019-11-11 2020-02-28 Beijing Baidu Netcom Science and Technology Co., Ltd. Model generation method and device
CN110852439A (en) * 2019-11-20 2020-02-28 ByteDance Ltd. Neural network model compression and acceleration method, data processing method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Pablo M. Tostado et al. Performance Trade-offs in Weight Quantization for Memory-Efficient Inference. 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems. 246-250 *
Ricardo Castro et al. SVD truncation schemes for fixed-size kernel models. 2014 International Joint Conference on Neural Networks (IJCNN). 2014, 3922-3929 *
Wang Chenfeng. Research on Load Forecasting Technology for Power Systems. China Master's Theses Full-text Database, Engineering Science and Technology II, No. 2. C042-149 *
Gao Chen. Research Progress of FPGA-based Recurrent Neural Network Accelerators. Chinese Journal of Network and Information Security. 2019, Vol. 5, No. 4, 1-13 *

Also Published As

Publication number Publication date
CN111563593A (en) 2020-08-21

Similar Documents

Publication Publication Date Title
CN111667054B (en) Method, device, electronic equipment and storage medium for generating neural network model
CN111539514B (en) Method and apparatus for generating a structure of a neural network
CN111563593B (en) Training method and device for neural network model
CN111582453B (en) Method and device for generating neural network model
CN111539479B (en) Method and device for generating sample data
CN111582454B (en) Method and device for generating neural network model
CN111582479B (en) Distillation method and device for neural network model
CN111667056B (en) Method and apparatus for searching model structures
KR102528748B1 (en) Method, apparatus, device and storage medium for constructing knowledge graph
CN114612749B (en) Neural network model training method and device, electronic device and medium
CN112149829B (en) Method, device, equipment and storage medium for determining pruning strategy of network model
CN112559870B (en) Multi-model fusion method, device, electronic equipment and storage medium
CN113807440A (en) Method, apparatus, and medium for processing multimodal data using neural networks
CN111582375A (en) Data enhancement strategy searching method, device, equipment and storage medium
CN110795569A (en) Method, device and equipment for generating vector representation of knowledge graph
CN110852379B (en) Training sample generation method and device for target object recognition
CN111079945B (en) End-to-end model training method and device
CN111241838B (en) Semantic relation processing method, device and equipment for text entity
CN112560499B (en) Pre-training method and device for semantic representation model, electronic equipment and storage medium
CN111582477A (en) Training method and device of neural network model
CN111767833A (en) Model generation method and device, electronic equipment and storage medium
CN111738419A (en) Quantification method and device of neural network model
CN112149266A (en) Method, device, equipment and storage medium for determining network model quantization strategy
KR20230006601A (en) Alignment methods, training methods for alignment models, devices, electronic devices and media
CN112580723B (en) Multi-model fusion method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant