CN117829228A - Neural network adjustment method, device, electronic equipment and readable storage medium - Google Patents

Neural network adjustment method, device, electronic equipment and readable storage medium

Info

Publication number
CN117829228A
CN117829228A CN202311686495.3A
Authority
CN
China
Prior art keywords
target
input
network
sub
fine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311686495.3A
Other languages
Chinese (zh)
Inventor
Yang Qing (杨青)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Du Xiaoman Technology Beijing Co Ltd
Original Assignee
Du Xiaoman Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Du Xiaoman Technology Beijing Co Ltd filed Critical Du Xiaoman Technology Beijing Co Ltd
Priority to CN202311686495.3A
Publication of CN117829228A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a neural network adjustment method, a device, electronic equipment, and a readable storage medium, comprising the following steps: receiving a first input, where the first input is a model parameter of a target model; in response to the first input, performing low-rank approximation on the model parameters to obtain a plurality of fine-tuning sub-networks of the target model; receiving a second input, where the second input is an input signal feature of the target model; and, in response to the second input, adaptively matching the input signal feature against the output features of the plurality of fine-tuning sub-networks to obtain a target fine-tuning sub-network, with the target model and the target fine-tuning sub-network jointly outputting a result according to the input signal feature. The invention effectively alleviates the performance bottleneck caused by the limited expressive capacity of a single low-rank parameter fine-tuning sub-network during model fine-tuning.

Description

Neural network adjustment method, device, electronic equipment and readable storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a neural network adjustment method, a device, an electronic apparatus, and a readable storage medium.
Background
Deep neural network models have achieved remarkable results in many fields, such as natural language processing, computer vision, and speech recognition. By learning rich representation capabilities during the pre-training phase, these models can capture complex patterns and abstract features in the data. However, to adapt to a particular application scenario, these models often need to be fine-tuned on a specific task. Fine-tuning is a technique for adjusting a pre-trained model on a target task: by training on a smaller data set, it enables the model to better accommodate different data distributions and feature representation requirements. For example, for tasks such as natural language understanding, image classification, and speech synthesis, fine-tuning enables these models to more accurately understand text, recognize objects in images, or generate natural, fluent speech.
However, as model sizes increase, conventional fine-tuning methods face challenges in processing deep neural network models at large parameter scales. First, fine-tuning a large model requires processing more parameters and more complex model structures, which increases computational and memory requirements and makes the fine-tuning process more time-consuming and resource-intensive. Second, performance bottlenecks easily arise when fine-tuning large models: for sub-network fine-tuning techniques typified by low-rank parameter fine-tuning (LoRA), it is often difficult to fully exploit the potential of the model due to the limited expressive capability of the sub-networks.
Disclosure of Invention
In view of this, embodiments of the present invention provide a neural network adjustment method, device, electronic apparatus, and readable storage medium, so as to improve the efficiency and performance of the fine tuning of the neural network.
According to an aspect of the present invention, there is provided a neural network adjustment method, including:
receiving a first input, wherein the first input is a model parameter of a target model;
responding to the first input, performing low-rank approximation on the model parameters, and obtaining a plurality of fine-tuning sub-networks of the target model;
receiving a second input, wherein the second input is an input signal characteristic of the target model;
and responding to the second input, adaptively matching the input signal characteristics and the output characteristics of a plurality of fine tuning sub-networks to obtain a target fine tuning sub-network, and outputting a result by the target model and the target fine tuning sub-network together according to the input signal characteristics.
Optionally, the adaptively matching the input signal characteristics and the output characteristics of the plurality of fine tuning sub-networks in response to the second input, to obtain a target fine tuning sub-network, and the target model and the target fine tuning sub-network jointly output a result according to the input signal characteristics, including:
estimating the adaptation of the output characteristics of each of the fine-tuning sub-networks to the input signal characteristics;
sparse sampling is carried out on a plurality of adaptation results, and the fine tuning sub-network with the maximum confidence coefficient is selected as the target fine tuning sub-network;
and adjusting the target model by the target fine-tuning sub-network, and then outputting a result by the target model according to the input signal characteristics.
Optionally, before obtaining the plurality of fine-tuning sub-networks of the target model, the method further includes:
and regularizing and restraining the parameter distribution of the fine tuning sub-network.
Optionally, after the low-rank approximation is performed on the model parameters in response to the first input to obtain a plurality of fine-tuning sub-networks of the target model, the method further includes:
and the plurality of fine-tuning sub-networks of the target model share a dimension-increasing matrix.
According to a second aspect of the present invention, there is provided a neural network adjustment device including:
the first receiving module is used for receiving a first input, wherein the first input is a model parameter of a target model;
the first processing module is used for responding to the first input and performing low-rank approximation on the model parameters to obtain a plurality of fine-tuning sub-networks of the target model;
the second receiving module is used for receiving a second input, and the second input is the input signal characteristic of the target model;
and the adjusting module is used for responding to the second input, adaptively matching the input signal characteristics and the output characteristics of the plurality of fine tuning sub-networks to obtain a target fine tuning sub-network, and outputting a result by the target model and the target fine tuning sub-network according to the input signal characteristics.
Optionally, the adjusting module includes:
an estimation module, configured to estimate a degree of adaptation of the output characteristic of each of the fine-tuning sub-networks to the input signal characteristic;
the selection module is used for performing sparse sampling on a plurality of adaptation results, and selecting the fine tuning sub-network with the maximum confidence as the target fine tuning sub-network;
and the adjustment sub-module is used for adjusting the target model by the target fine-tuning sub-network, and then the target model outputs a result according to the input signal characteristics.
Optionally, the neural network adjusting device further includes:
and the second processing module is used for regularizing and restraining the parameter distribution of the fine-tuning sub-network.
Optionally, the neural network adjusting device further includes:
and the third processing module is used for sharing the ascending dimension matrix by the fine tuning sub-networks of the plurality of target models.
According to a third aspect of the present invention, there is provided an electronic device comprising:
a processor; and
a memory in which a program is stored,
wherein the program comprises instructions which, when executed by the processor, cause the processor to perform the method according to any of the first aspects of the invention.
According to a fourth aspect of the present invention there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method according to any one of the first aspects of the present invention.
According to the one or more technical solutions provided by the embodiments of the present application, a plurality of groups of different dynamic low-rank parameter fine-tuning sub-networks are trained, and the best-matching fine-tuning sub-network is selected dynamically and adaptively according to the current input state, thereby effectively alleviating the performance bottleneck caused by the limited expressive capacity of a single low-rank parameter fine-tuning sub-network during model fine-tuning.
Drawings
Further details, features and advantages of the invention are disclosed in the following description of exemplary embodiments with reference to the following drawings, in which:
fig. 1 illustrates a flowchart of a neural network tuning method according to an exemplary embodiment of the present invention;
FIG. 2 illustrates a flowchart of a method for adjusting a text-to-image network model according to an exemplary embodiment of the present invention;
FIG. 3 illustrates an effect comparison diagram of a neural network tuning method according to an exemplary embodiment of the present invention;
fig. 4 shows a schematic block diagram of a neural network tuning device according to an exemplary embodiment of the present invention;
fig. 5 shows a block diagram of an exemplary electronic device that can be used to implement an embodiment of the invention.
Detailed Description
Embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the invention are shown in the drawings, it is to be understood that the invention may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the invention. It should be understood that the drawings and embodiments of the invention are for illustration purposes only and are not intended to limit the scope of the present invention.
It should be understood that the various steps recited in the method embodiments of the present invention may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the invention is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below. It should be noted that the terms "first," "second," and the like herein are merely used for distinguishing between different devices, modules, or units, and not for limiting the order or interdependence of the functions performed by such devices, modules, or units.
It should be noted that references to "a" and "a plurality" in this disclosure are intended to be illustrative rather than limiting, and those skilled in the art will appreciate that they should be construed as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the devices in the embodiments of the present invention are for illustrative purposes only and are not intended to limit the scope of such messages or information.
The following describes the solution of the present invention with reference to the drawings, and the technical solution provided in the embodiments of the present application is described in detail through specific embodiments and application scenarios thereof.
Currently, there are some techniques and methods for fine-tuning deep neural network models at large parameter scales. These techniques are typically based on pre-trained models, with fine-tuning training on the target task to suit a particular application scenario. In conventional fine-tuning methods, researchers typically use gradient-descent optimization algorithms to fine-tune the model, computing gradients by back-propagation and updating parameters to minimize the loss function of the target task. In addition, sub-network fine-tuning techniques such as pruning and low-rank parameter fine-tuning (LoRA) are also widely used. Pruning reduces the size and computational complexity of the model by removing redundant connections and parameters while maintaining model performance. Low-rank parameter fine-tuning reduces the number of model parameters and the amount of computation by performing a low-rank approximation on the model parameters, thereby improving the efficiency and generalization capability of the model. These techniques provide effective tools and methods for fine-tuning large deep neural network models to accommodate different task requirements.
Although existing sub-network fine-tuning methods such as low-rank parameter fine-tuning can improve performance to a certain extent, the limited expressive capability of the sub-network often makes it difficult to fully realize the potential of a large model. Low-rank parameter fine-tuning reduces the number of parameters and the amount of computation by making low-rank approximations to the model parameters, but such approximation introduces information loss and limits representation capability. In particular, for complex tasks and large-scale models, the lack of expressive power in the sub-networks may lead to bottlenecks in fine-tuning performance, leaving the model unable to fully mine potential patterns and abstract features in the data.
Taking the text-to-image task as an example: based on the currently most popular pre-trained model, Stable Diffusion v1.5, adding low-rank parameters to the cross-attention network layers of the Stable Diffusion model is the most common network fine-tuning method. However, experiments show that the performance of the fine-tuned model does not change noticeably as the rank of the fine-tuning sub-network increases, as shown in Table 1:
Table 1: Experimental results of fine-tuning Stable Diffusion v1.5 on a subset of the image-text data having an aesthetic score greater than 6.5
Furthermore, during fine-tuning of the GPT-3 language model, researchers found that even with rank 1, the performance of LoRA is already very close to that of rank 64. This suggests that the effective subspace of the weight change may be very small, so increasing the rank does not bring significant performance improvement. These results all indicate that the effectiveness of LoRA may depend mainly on the intrinsic rank of the weight change, not on the size of the rank itself. When the fine-tuning data distribution is complex, a single LoRA yields a fine-tuning sub-network that tends to fit a simple bias term to the data distribution and cannot effectively model the overall distribution. In particular, for the text-to-image task, image styles are diverse, and a single LoRA can hardly cover all of them.
As described above, low-rank parameter fine-tuning reduces the number of parameters and the amount of computation by making low-rank approximations to the model parameters, but such approximation introduces information loss and limits representation capability. In particular, for complex tasks and large-scale models, the lack of expressive power in the sub-networks may lead to bottlenecks in fine-tuning performance, leaving the model unable to fully mine potential patterns and abstract features in the data.
As shown in fig. 1, fig. 1 is a schematic flow chart of a neural network adjustment method provided in an embodiment of the present application, and the method may include the following steps S101 to S104:
s101, receiving a first input, wherein the first input is a model parameter of a target model.
In this embodiment, a pre-trained text-to-image model is used as the target model.
S102, in response to the first input, performing low-rank approximation on the model parameters to obtain a plurality of fine-tuning sub-networks of the target model.
In this embodiment, the low-rank approximation is a numerical approximation method based on matrix decomposition, used to decompose a high-dimensional matrix into a product of several low-dimensional matrices. The aim is to reduce storage and computation costs and to accelerate matrix operations while guaranteeing a certain accuracy. In low-rank approximation, Singular Value Decomposition (SVD) or eigenvalue decomposition is typically used to decompose the original matrix.
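As context for this step, a truncated-SVD low-rank factorization can be sketched in a few lines of NumPy. This is an illustrative sketch, not the patent's implementation; the function name `low_rank_approx` and all dimensions are assumptions:

```python
import numpy as np

def low_rank_approx(W, r):
    """Truncated SVD: factor W into a dimension-reducing part A and a
    dimension-increasing part B so that A @ B is the best rank-r
    approximation of W in the Frobenius norm."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * S[:r]   # shape (d_out, r): absorbs the top-r singular values
    B = Vt[:r, :]          # shape (r, d_in)
    return A, B

W = np.random.randn(64, 64)
A, B = low_rank_approx(W, 8)   # A @ B approximates W with rank at most 8
```

For a matrix that is exactly rank r, the factorization recovers it to numerical precision, which is why LoRA-style methods lose little when the effective rank of the weight change is small.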
The model parameters are decomposed by low-rank approximation into a dimension-reducing matrix A_i and a dimension-increasing matrix B_i, constructing N LoRA sub-networks, where, for an input feature x, the i-th LoRA sub-network outputs the corresponding feature xA_iB_i.
S103, receiving a second input, wherein the second input is the input signal characteristic of the target model.
S104, responding to the second input, adaptively matching the input signal characteristics and the output characteristics of the plurality of fine tuning sub-networks to obtain a target fine tuning sub-network, and outputting a result by the target model and the target fine tuning sub-network according to the input signal characteristics.
In this embodiment, through an adaptive matching procedure, the one target fine-tuning sub-network that best matches the input signal features is selected from the plurality of fine-tuning sub-networks; the target model and the target fine-tuning sub-network then jointly output a result according to the input signal features, for example the input to a text-to-image model. Because the target fine-tuning sub-network adjusts the target model, fine-tuning efficiency and performance are higher, effectively alleviating the performance bottleneck caused by the limited expressive capacity of a single low-rank parameter fine-tuning sub-network when fine-tuning a large model.
In one implementation of the present embodiment, the weights of the pre-trained model change with a very low "intrinsic rank" during adaptation, so some dense layers in the neural network can be trained indirectly by optimizing a rank-decomposition matrix of the weight change while keeping the pre-trained weights frozen. Specifically, the LoRA method adds a trainable rank-decomposition matrix pair to an existing weight matrix: for an original weight matrix W, two low-rank matrices A and B are introduced such that:
W′ = W +λAB (1)
where W′ is the adapted weight matrix. During training, W is kept frozen, and only A and B are updated. The coefficient λ is used mainly in the testing stage, after the model has been fine-tuned, to adjust how strongly the fine-tuning influences the final model output.
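Equation (1) can be sketched as a forward pass in NumPy. This is a minimal illustration under assumed dimensions; zero-initializing B (so the adapted model starts identical to the frozen one) is common LoRA practice, not something stated in this passage:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, lam = 16, 4, 1.0                   # assumed feature dim, rank, scaling

W = rng.standard_normal((d, d))          # frozen pre-trained weight matrix
A = 0.01 * rng.standard_normal((d, r))   # trainable dimension-reducing matrix
B = np.zeros((r, d))                     # trainable dimension-increasing matrix, zero-init

def lora_forward(x):
    # Eq. (1): y = x(W + lam*AB) -- frozen path plus the low-rank update
    return x @ W + lam * ((x @ A) @ B)

x = rng.standard_normal((2, d))
y = lora_forward(x)
# with B initialized to zero, the adapted output equals the frozen output
```

During training only A and B would receive gradient updates, which is what keeps the trainable parameter count at 2·d·r instead of d².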
In one embodiment of the present application, S104, in response to the second input, adaptively matching the input signal characteristic and the output characteristics of the plurality of fine tuning sub-networks, obtaining a target fine tuning sub-network, and outputting, by the target model and the target fine tuning sub-network, a result according to the input signal characteristic together, includes:
s1041, estimating a fitness of the output characteristic of each fine-tuning sub-network to the input signal characteristic.
S1042, performing sparse sampling on a plurality of adaptation results, and selecting a fine tuning sub-network with the highest confidence as a target fine tuning sub-network.
S1043, the target fine-tuning sub-network adjusts the target model, and then the target model outputs a result according to the input signal characteristics.
In one implementation manner of this embodiment, the fitness of each LoRA to the current input signal is estimated, so as to obtain an estimation result a corresponding to each LoRA, where an estimation formula is as follows:
a_i = f(x, xA_i)    (2)
The function f(·) can take several forms: the simplest implementation computes it directly with a fully-connected network layer; a slightly more complex implementation computes it with a cross-attention-like mechanism.
Because a is a vector of floating-point numbers, in order to keep the training process as sparse as possible while routing features of different styles to different LoRA sub-networks, 1-vs-N sparse sampling is performed according to the values in a during fine-tuning training, giving the final result:
a′ = index(Categorical(a))    (3)
During model testing, the LoRA branch with the highest confidence can be selected directly:
a′ = argmax(a)    (4)
Finally, the output of the dynamic low-rank adaptive sub-network is xA_{a′}B_{a′}. The target fine-tuning sub-network adjusts the text-to-image target model, and the text-to-image target model outputs a result according to the input signal features. After this sparse representation is adopted, the overall performance of the fine-tuned model can be greatly improved at the cost of only a small increase in computation.
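The selection scheme of equations (2)–(4) can be sketched end-to-end in NumPy: a simple fully-connected gate scores each branch from the input and its reduced feature, a categorical sample (training) or argmax (testing) picks one branch, and only that branch's xA_iB_i is emitted. The gate form and all dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, N = 16, 4, 8   # feature dim, LoRA rank, number of sub-networks (assumed)

As = rng.standard_normal((N, d, r))   # per-branch dimension-reducing matrices A_i
Bs = rng.standard_normal((N, r, d))   # per-branch dimension-increasing matrices B_i
Wg = rng.standard_normal((d + r, N))  # gate weights: simplest fully-connected f(.)

def branch_scores(x):
    z = np.stack([x @ As[i] for i in range(N)])   # reduced features xA_i, shape (N, r)
    # a_i = f(x, xA_i): score each branch from the input and its reduced feature
    a = np.array([np.concatenate([x, z[i]]) @ Wg[:, i] for i in range(N)])
    return z, a

def dynamic_lora(x, training=False):
    z, a = branch_scores(x)
    if training:
        p = np.exp(a - a.max()); p /= p.sum()     # softmax over scores
        idx = int(rng.choice(N, p=p))             # 1-vs-N categorical sampling, Eq. (3)
    else:
        idx = int(np.argmax(a))                   # highest-confidence branch, Eq. (4)
    return z[idx] @ Bs[idx], idx                  # output xA_{a'}B_{a'}

x = rng.standard_normal(d)
out, branch = dynamic_lora(x)
```

Only the selected branch's up-projection is computed at test time, so the per-step cost stays close to a single LoRA regardless of N.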
In one embodiment of the present application, before S102 obtains the plurality of fine-tuning sub-networks of the target model, the method further includes:
s102a, regularization constraint is carried out on parameter distribution of the fine tuning sub-network.
In this embodiment, as shown in fig. 2, diversity regularization aims to encourage different LoRA sub-networks to produce differentiated output results during fine-tuning, thereby increasing the diversity and generalization capability of the model.
The diversity regularization method increases the variance of the output results by encouraging parameter differences between sub-networks. Specifically, a regularization term can be introduced into the loss function so that the parameters of different sub-networks stay as far apart from each other as possible during fine-tuning. By limiting the similarity between sub-networks, they are driven to learn different feature representations and patterns, producing more diverse output results.
To this end, a diversity regularization term may be added to the overall loss function:

L = L_task + α·L_div    (5)

where L_task is the loss function of the target task, measuring model performance on the specific task; L_div is the diversity regularization term, encouraging variability between sub-networks; and α is a regularization coefficient balancing the importance of the target task against diversity.
In one implementation of the present embodiment, the diversity regularization constraints include, but are not limited to, the following three:
1. L1 regularization or L2 regularization. By penalizing the parameters of the sub-networks, they are kept as far apart from each other as possible during fine-tuning.
2. KL divergence (Kullback-Leibler divergence) or Jensen-Shannon divergence. By minimizing the KL divergence or maximizing the Jensen-Shannon divergence, the subnetwork can be encouraged to produce differentiated output results.
3. Mutual information constraint: mutual information may be used as an indicator for measuring the correlation between different sub-network outputs. Mutual information can measure the correlation and degree of information sharing between two random variables. By maximizing mutual information, the variability between different sub-network outputs can be encouraged to increase.
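One way to realize a distance-based diversity term of the kind listed above is a negated mean pairwise squared distance between sub-network parameter vectors, added to the task loss with a coefficient α. This is an illustrative sketch; the function names and the specific pairing scheme are assumptions, not the patent's definition:

```python
import numpy as np

def diversity_penalty(params):
    """Mean pairwise squared distance between sub-network parameter
    vectors, negated: minimizing this term pushes the parameters of
    the sub-networks apart from each other."""
    n = len(params)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += float(np.sum((params[i] - params[j]) ** 2))
            pairs += 1
    return -total / pairs

def total_loss(task_loss, params, alpha=0.1):
    # overall objective: task loss plus alpha times the diversity term
    return task_loss + alpha * diversity_penalty(params)
```

Identical sub-networks incur zero penalty, and the penalty grows more negative (more rewarding) as the parameter vectors separate, which is the intended pressure toward differentiated feature representations.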
In one embodiment of the present application, after S102 performs low-rank approximation on the model parameters in response to the first input and obtains the plurality of fine-tuning sub-networks of the target model, the method further includes:
s102b, the fine tuning sub-networks of the plurality of target models share the dimension-increasing matrix.
In this embodiment, because the neural network training process must store the intermediate results before and after each sub-network's output, a large amount of additional GPU memory consumption is likely, and this consumption grows linearly with the number of LoRA sub-networks. The combination of multiple LoRA sub-networks is therefore modified: each LoRA keeps its own independent dimension-reducing matrix A_i, but all LoRAs share the dimension-increasing matrix B. In this way, the memory consumed by the results before and after the sub-network output stays consistent with a single LoRA; the only newly added memory corresponds to the additional reduced-dimension features, which is almost negligible because the rank of a LoRA is generally very small.
In one implementation of this embodiment, in the fine-tuning experiment on Stable Diffusion v1.5, adding 32 groups of LoRA sub-networks increases GPU memory by less than 1% and computation time by only 4%. The implementation uses a single fully-connected layer A whose input dimension is consistent with that of A_i but whose output dimension is N times the output dimension of A_i, while B keeps the same shape as B_i. The result of the A operation is split into N equal parts to simulate the reduced-dimension result of each A_i in the N LoRA sub-networks. Letting the input feature be x, after the combination of the low-rank fine-tuning sub-networks, the obtained feature output is the selected slice of xA multiplied by the shared matrix B, i.e., split_N(xA)_{a′}B.
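The shared-B arrangement can be sketched as follows: one wide matrix A whose output is split into N slices, each slice playing the role of one xA_i, followed by the single shared up-projection B. Dimensions and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
d, r, N = 16, 4, 8   # feature dim, LoRA rank, number of sub-networks (assumed)

A_wide = rng.standard_normal((d, N * r))  # one fully-connected layer simulating all A_i
B = rng.standard_normal((r, d))           # single dimension-increasing matrix shared by all LoRAs

def shared_b_forward(x, branch):
    # split xA into N equal slices; slice i stands in for xA_i,
    # then apply the shared up-projection B
    z = (x @ A_wide).reshape(N, r)
    return z[branch] @ B

x = rng.standard_normal(d)
y = shared_b_forward(x, 3)
```

Because the slice `z[branch]` equals `x @ A_wide[:, branch*r:(branch+1)*r]`, each slice behaves exactly like an independent A_i while all branches reuse the one stored activation of A and the single matrix B.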
as shown in fig. 3, the method of this embodiment is compared with the trimming result of the conventional LoRA algorithm. It can be seen that the comparison of both the quantization index and the visual result proves the superiority of the method of the embodiment compared with the original LoRA algorithm.
According to the neural network adjustment method provided by the embodiments of the present application, a plurality of groups of different dynamic low-rank parameter fine-tuning sub-networks are trained, and regularization constraints are applied to the parameter distributions of the low-rank fine-tuning sub-networks, maximizing the differences between their feature representations and increasing the overall capacity of the sub-networks used for fine-tuning, thereby relieving the bottleneck in sub-network expressive capacity. The best-matching fine-tuning sub-network is then selected dynamically and adaptively according to the current input state, effectively alleviating the performance bottleneck caused by the limited expressive capacity of a single low-rank parameter fine-tuning sub-network during model fine-tuning.
Corresponding to the above embodiments, referring to fig. 4, the embodiment of the present application further provides a neural network adjustment device 400, including:
a first receiving module 401, configured to receive a first input, where the first input is a model parameter of the target model;
the first processing module 402 is used for, in response to the first input, performing low-rank approximation on the model parameters to obtain a plurality of fine-tuning sub-networks of the target model;
a second receiving module 403, configured to receive a second input, where the second input is an input signal feature of the target model;
the adjustment module 404, in response to the second input, adaptively matches the input signal characteristics with the output characteristics of the plurality of fine tuning sub-networks to obtain a target fine tuning sub-network, and the target model and the target fine tuning sub-network output the result together according to the input signal characteristics.
Optionally, the adjustment module 404 includes:
an estimation module 4041, configured to estimate a fitness of the output characteristic of each fine-tuning sub-network to the input signal characteristic;
the selection module 4042 is configured to perform sparse sampling on a plurality of adaptation results, and select a fine tuning sub-network with the highest confidence as a target fine tuning sub-network;
the adjustment submodule 4043 is configured to adjust the target model by the target fine tuning subnetwork, and then the target model outputs a result according to the input signal feature.
Optionally, the neural network adjustment device 400 further includes:
a second processing module 405, configured to perform regularization constraint on the parameter distribution of the fine-tuning sub-network.
Optionally, the neural network adjustment device 400 further includes:
a third processing module 406 is configured to cause the plurality of fine-tuning sub-networks of the target model to share the dimension-increasing matrix.
According to the neural network adjustment device, a plurality of groups of different dynamic low-rank parameter fine-tuning sub-networks are trained, and the best-matching fine-tuning sub-network is selected dynamically and adaptively according to the current input state, effectively alleviating the performance bottleneck caused by the limited expressive capacity of a single low-rank parameter fine-tuning sub-network during model fine-tuning.
The exemplary embodiments of the invention also provide an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor. The memory stores a computer program executable by the at least one processor; when executed by the at least one processor, the computer program causes the electronic device to perform a method according to an embodiment of the invention.
The exemplary embodiments of the invention also provide a non-transitory computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor of a computer, causes the computer to perform a method according to an embodiment of the invention.
The exemplary embodiments of the invention also provide a computer program product comprising a computer program, wherein the computer program, when executed by a processor of a computer, causes the computer to perform a method according to an embodiment of the invention.
With reference to fig. 5, a block diagram of an electronic device 500, which may be a server or a client of the present invention, will now be described as an example of a hardware device that may be applied to aspects of the present invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the invention described and/or claimed herein.
As shown in fig. 5, the electronic device 500 includes a computing unit 501 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 502 or a computer program loaded from a storage unit 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the device 500 can also be stored. The computing unit 501, ROM 502, and RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
A number of components in electronic device 500 are connected to I/O interface 505, including: an input unit 506, an output unit 507, a storage unit 508, and a communication unit 509. The input unit 506 may be any type of device capable of inputting information to the electronic device 500; it may receive input numeric or character information and generate key signal inputs related to user settings and/or function controls of the electronic device. The output unit 507 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers. Storage unit 508 may include, but is not limited to, magnetic disks and optical disks. The communication unit 509 allows the electronic device 500 to exchange information/data with other devices over a computer network such as the internet and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as Bluetooth(TM) devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.
The computing unit 501 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 501 performs the various methods and processes described above. For example, in some embodiments, the neural network tuning method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 500 via the ROM 502 and/or the communication unit 509. In some embodiments, the computing unit 501 may be configured to perform the neural network tuning method by any other suitable means (e.g., by means of firmware).
Program code for carrying out methods of the present invention may be written in any combination of one or more programming languages. This program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable neural network tuning apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

Claims (10)

1. A neural network tuning method, comprising:
receiving a first input, wherein the first input is a model parameter of a target model;
in response to the first input, performing a low-rank approximation on the model parameters to obtain a plurality of fine-tuning sub-networks of the target model;
receiving a second input, wherein the second input is an input signal characteristic of the target model;
and in response to the second input, adaptively matching the input signal characteristics with the output characteristics of the plurality of fine-tuning sub-networks to obtain a target fine-tuning sub-network, and outputting, by the target model together with the target fine-tuning sub-network, a result according to the input signal characteristics.
2. The neural network tuning method of claim 1, wherein said adaptively matching the input signal characteristics and the output characteristics of the plurality of fine tuning sub-networks in response to the second input to obtain a target fine tuning sub-network, and outputting, by the target model and the target fine tuning sub-network, results in accordance with the input signal characteristics together, comprises:
estimating a degree of adaptation between the output characteristic of each of the fine-tuning sub-networks and the input signal characteristics;
performing sparse sampling on the plurality of adaptation results, and selecting the fine-tuning sub-network with the highest confidence as the target fine-tuning sub-network;
and adjusting the target model using the target fine-tuning sub-network, and then outputting, by the target model, a result according to the input signal characteristics.
3. The neural network tuning method of claim 1, further comprising, before obtaining the plurality of fine tuning sub-networks of the target model:
applying a regularization constraint to the parameter distribution of the fine-tuning sub-networks.
4. The neural network tuning method according to claim 1 or 3, wherein, after the performing a low-rank approximation on the model parameters in response to the first input to obtain a plurality of fine-tuning sub-networks of the target model, the method further comprises:
sharing, by the fine-tuning sub-networks of the plurality of target models, an ascending-dimension matrix.
5. A neural network adjustment device, comprising:
the first receiving module is used for receiving a first input, wherein the first input is a model parameter of a target model;
the first processing module is used for responding to the first input and performing low-rank approximation on the model parameters to obtain a plurality of fine-tuning sub-networks of the target model;
the second receiving module is used for receiving a second input, and the second input is the input signal characteristic of the target model;
and the adjusting module is used for responding to the second input, adaptively matching the input signal characteristics and the output characteristics of the plurality of fine tuning sub-networks to obtain a target fine tuning sub-network, and outputting a result by the target model and the target fine tuning sub-network according to the input signal characteristics.
6. The neural network tuning device of claim 5, wherein the tuning module comprises:
an estimation module, configured to estimate a degree of adaptation of the output characteristic of each of the fine-tuning sub-networks to the input signal characteristic;
the selection module is used for performing sparse sampling on a plurality of adaptation results, and selecting the fine tuning sub-network with the maximum confidence as the target fine tuning sub-network;
and the adjustment sub-module is used for adjusting the target model by the target fine-tuning sub-network, and then the target model outputs a result according to the input signal characteristics.
7. The neural network tuning device of claim 5, further comprising:
and the second processing module is used for regularizing and restraining the parameter distribution of the fine-tuning sub-network.
8. The neural network tuning device of claim 5 or 7, further comprising:
and the third processing module is configured to cause the fine-tuning sub-networks of the plurality of target models to share an ascending-dimension matrix.
9. An electronic device, comprising:
a processor; and
a memory in which a program is stored,
wherein the program comprises instructions which, when executed by the processor, cause the processor to perform the method according to any of claims 1-4.
10. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-4.
CN202311686495.3A 2023-12-08 2023-12-08 Neural network adjustment method, device, electronic equipment and readable storage medium Pending CN117829228A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311686495.3A CN117829228A (en) 2023-12-08 2023-12-08 Neural network adjustment method, device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311686495.3A CN117829228A (en) 2023-12-08 2023-12-08 Neural network adjustment method, device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN117829228A true CN117829228A (en) 2024-04-05

Family

ID=90518058

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311686495.3A Pending CN117829228A (en) 2023-12-08 2023-12-08 Neural network adjustment method, device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN117829228A (en)

Similar Documents

Publication Publication Date Title
US11120102B2 (en) Systems and methods of distributed optimization
KR102170105B1 (en) Method and apparatus for generating neural network structure, electronic device, storage medium
US11604985B2 (en) Population based training of neural networks
US9400955B2 (en) Reducing dynamic range of low-rank decomposition matrices
CN110766142A (en) Model generation method and device
JP2022501677A (en) Data processing methods, devices, computer devices, and storage media
JP2022501675A (en) Data processing methods, devices, computer devices, and storage media
CN113610232B (en) Network model quantization method and device, computer equipment and storage medium
CN111488985A (en) Deep neural network model compression training method, device, equipment and medium
WO2021103675A1 (en) Neural network training and face detection method and apparatus, and device and storage medium
CN111860841B (en) Optimization method, device, terminal and storage medium of quantization model
CN110807529A (en) Training method, device, equipment and storage medium of machine learning model
WO2019001323A1 (en) Signal processing system and method
CN111667069B (en) Pre-training model compression method and device and electronic equipment
CN114580280A (en) Model quantization method, device, apparatus, computer program and storage medium
CN112766467A (en) Image identification method based on convolution neural network model
US20220004849A1 (en) Image processing neural networks with dynamic filter activation
CN116976461A (en) Federal learning method, apparatus, device and medium
CN117829228A (en) Neural network adjustment method, device, electronic equipment and readable storage medium
CN114896061B (en) Training method of computing resource control model, computing resource control method and device
CN115392594A (en) Electrical load model training method based on neural network and feature screening
CN115600693A (en) Machine learning model training method, machine learning model recognition method, related device and electronic equipment
CN114861671A (en) Model training method and device, computer equipment and storage medium
CN113128682A (en) Automatic neural network model adaptation method and device
US20210133626A1 (en) Apparatus and method for optimizing quantized machine-learning algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination