CN112132269A

CN112132269A - Model processing method, device, equipment and storage medium

Info

Publication number: CN112132269A
Application number: CN202011056384.0A
Authority: CN
Inventors: 陈思哲; 杨勇; 朱季峰
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-09-29
Filing date: 2020-09-29
Publication date: 2020-12-25
Anticipated expiration: 2040-09-29
Also published as: CN112132269B

Abstract

The embodiment of the invention discloses a model processing method, a device, equipment and a storage medium, wherein the method comprises the following steps: acquiring a neural network model, wherein the neural network model is used for identifying and processing data of a target type, the neural network model is composed of a plurality of neurons, and each neuron corresponds to a first parameter value; obtaining a target number of neurons to be modified from the plurality of neurons according to the model updating target; and acquiring a second parameter value corresponding to each neuron in the target number of neurons to be modified, and modifying the first parameter value corresponding to the corresponding neuron by adopting the second parameter value corresponding to each neuron in the target number of neurons to be modified. By adopting the embodiment of the invention, the updating efficiency of the model can be improved.

Description

Model processing method, device, equipment and storage medium

Technical Field

The present application relates to the field of artificial intelligence, and in particular, to a model processing method, apparatus, device, and storage medium.

Background

Model poisoning refers to performing special processing on a neural network model, so that the accuracy of the poisoned neural network model is reduced or the poisoned neural network model is inserted into a backdoor. The back door is inserted, namely the poisoned neural network model identifies specified data or data with special marks as specific data, for example, all handwritten numbers carrying special labels are identified as specific numbers 2.

The existing model virus administration method is data virus administration generally, and the principle is that a model trainer is induced to crawl, collect and add into a training set of a model by scattering polluted samples. Therefore, the model learns the clean sample and the polluted sample at the same time, so as to achieve the purpose of model poisoning. However, practical researches find that the toxicity administration method is slow in realization speed and difficult to guarantee the effect. Therefore, in the field of machine learning, how to implement model poisoning becomes a hot issue in current research.

Disclosure of Invention

The embodiment of the invention provides a model processing method, a model processing device and a storage medium, which can improve the efficiency of model processing.

In one aspect, an embodiment of the present invention provides a model processing method, including:

acquiring a neural network model, wherein the neural network model is used for identifying and processing data of a target type, the neural network model is composed of a plurality of neurons, and each neuron corresponds to a first parameter value;

acquiring a target number of neurons to be modified from the neurons according to a model updating target, wherein the model updating target is used for indicating that the updated neural network model can identify data of a target type carrying a trigger tag as target data;

and acquiring a second parameter value corresponding to each neuron in the target number of neurons to be modified, and modifying a first parameter value corresponding to the corresponding neuron by using the second parameter value corresponding to each neuron in the neurons to be modified.

In one aspect, an embodiment of the present invention provides a model processing apparatus, including:

the acquisition unit is used for acquiring a neural network model, the neural network model is used for identifying and processing data of a target type, the neural network model is composed of a plurality of neurons, and each neuron corresponds to a first parameter value;

the obtaining unit is further configured to obtain a target number of neurons to be modified from the plurality of neurons according to a model number of update targets, where the model update target is used to indicate that the updated neural network model can identify data of a target type carrying a trigger tag as target data;

the obtaining unit is further configured to obtain a second parameter value corresponding to each of the target number of neurons to be modified;

and the processing unit is used for modifying the first parameter value corresponding to the corresponding neuron by adopting the second parameter value corresponding to each neuron in the neurons to be modified.

In one embodiment, when the obtaining unit obtains a target number of neurons to be modified from the plurality of neurons according to a model update target, the obtaining unit performs the following steps:

acquiring a positive sample data set, wherein the positive sample data set comprises a plurality of positive sample data belonging to a target type and a first supervision tag corresponding to each positive sample data in the plurality of positive sample data; adding the trigger tag to each positive sample data in the multiple positive sample data to obtain multiple trigger sample data; acquiring a second supervision tag corresponding to each trigger sample data in the plurality of trigger sample data, and obtaining a trigger sample data set according to the plurality of trigger sample data and the second supervision tag corresponding to each trigger sample data; training the neural network model based on the positive sample data set and the trigger sample data set, and selecting a target number of neurons to be modified from the plurality of neurons in the training process.

In an embodiment, when the obtaining unit adds a trigger tag to each positive sample data in the multiple positive sample data to obtain multiple trigger sample data, the obtaining unit executes the following steps: acquiring the trigger tag, and determining a target position in each positive sample data included in the target positive sample data; and adding the trigger tag at the target position corresponding to each sample data to obtain a plurality of trigger sample data.

In an embodiment, when the obtaining unit trains the neural network model based on the positive sample data set and the trigger sample data set, and selects a target number of neurons to be modified from the plurality of neurons in a training process, the obtaining unit performs the following steps:

calling the neural network model to identify each positive sample data in the positive sample data set to obtain a first prediction label corresponding to each positive sample data, and calling the neural network model to identify each trigger sample data in the trigger sample data set to obtain a second prediction label corresponding to each trigger sample data;

obtaining a first loss function based on the first prediction tag corresponding to each positive sample data and the first supervision tag corresponding to the corresponding positive sample data, and obtaining a second loss function based on the second prediction tag corresponding to each trigger sample data and the second supervision tag corresponding to the corresponding trigger sample data;

a target loss function is determined from the first loss function and the second loss function, and a target number of neurons to be modified is determined based on a value of the target loss function.

In one embodiment, the obtaining unit, when determining the target number of neurons to be modified based on a value of a target loss function, performs the steps of: determining a gradient value for each of a plurality of neurons comprised by the neural network model based on a value of an objective loss function; and selecting the target number of neurons from the plurality of neurons according to the descending order of the gradient values to determine the target number of neurons as candidate neurons.

In one embodiment, when the obtaining unit obtains the second parameter value corresponding to each neuron in the target number of target neurons, the obtaining unit performs the following steps:

obtaining a learning rate parameter, and determining a gradient value corresponding to each neuron in the target number of neurons to be modified according to the target loss function; multiplying the learning rate parameter and the gradient value corresponding to each neuron to obtain the parameter variation of each neuron; and performing subtraction operation on the first parameter value corresponding to each neuron and the parameter variation corresponding to the corresponding neuron, and taking the obtained result as a second parameter value corresponding to each neuron.

In one embodiment, the processing unit is further configured to: if the updated neural network model meets the update ending condition, ending the update of the neural network model; and if the updated neural network model does not meet the update ending condition, executing the step of acquiring a target number of neurons to be modified from the neurons according to the model update target.

In one embodiment, the obtaining unit is further configured to obtain data to be identified, where the data to be identified belongs to a target type; the processing unit is further configured to, if the data to be identified carries a trigger tag, invoke the updated neural network model to perform identification processing on the data to be identified to obtain a first identification result, where the first identification result is used to indicate that the data to be identified is target data; and the processing unit is further used for calling the updated neural network model to perform identification processing on the data to be identified to obtain a second identification result if the data to be identified does not carry the trigger tag.

a processor adapted to implement one or more instructions; and a computer storage medium having one or more instructions adapted to be loaded and executed by the processor to:

In one aspect, an embodiment of the present invention provides a computer storage medium, where computer program instructions are stored in the computer storage medium, and when executed by a processor, the computer program instructions are configured to perform:

In one aspect, an embodiment of the present invention provides a computer program product or a computer program, where the computer program product or the computer program includes computer instructions stored in a computer-readable storage medium; a processor of a model processing device reads the computer instructions from the computer storage medium, the processor executing the computer instructions to perform:

In the embodiment of the invention, a neural network model for identifying and processing the data of the target type is obtained; obtaining a target number of neurons to be modified and a second parameter value corresponding to each neuron in the neurons to be modified from a plurality of neurons included in the neural network model according to the model updating target; furthermore, a second parameter value corresponding to each neuron in the neurons to be modified is adopted to modify the first parameter value corresponding to the corresponding neuron. In the above process, the model update target is used to indicate that the updated neural network model can recognize the data of the target type with the thin trigger tag as the target data, that is, the model update target is to install a backdoor in the neural network. The target number is less than the total number of the plurality of neurons included in the neural network model, so that the parameter values of part of the neurons in the neural network model are modified, the neural network model can be updated, and the speed and the accuracy of model processing can be improved by directly modifying the parameter values of the neurons.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.

FIG. 1a is a schematic structural diagram of a neural network model according to an embodiment of the present invention;

FIG. 1b is a diagram illustrating a parameter set of a neural network model according to an embodiment of the present invention;

FIG. 1c is a schematic structural diagram of another neural network model provided in an embodiment of the present invention;

FIG. 2 is a schematic flow chart diagram illustrating a model processing method according to an embodiment of the present invention;

FIG. 3 is a schematic flow chart diagram of another method for processing a model according to an embodiment of the present invention;

fig. 4 is a schematic diagram of performing identification processing on data to be identified according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a model processing apparatus according to an embodiment of the present invention;

fig. 6 is a schematic structural diagram of a model processing apparatus according to an embodiment of the present invention.

Detailed Description

The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.

The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.

Machine Learning (ML) is a multi-domain cross subject, and relates to multiple subjects such as probability theory, statistics, approximation theory, convex analysis and algorithm complexity theory. The special research on how a computer simulates or realizes the learning behavior of human beings so as to acquire new knowledge or skills and reorganize the existing knowledge structure to continuously improve the performance of the computer. Machine learning is the core of artificial intelligence, is the fundamental approach for computers to have intelligence, and is applied to all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and formal education learning.

The neural network is widely used in the fields of vision, voice, text, recommendation system and the like. For example, a picture beautification application uses a neural network to perform tasks such as face positioning, style migration and the like; for another example, some dictionaries and translation applications use neural network models for speech recognition and text translation; as another example, there are also a number of neural networks in an autonomous driving system of a vehicle.

With the development of neural network models, model virus exposure is more and more common, and the purpose of model virus exposure is to enable the model to be maliciously manipulated, the precision to be reduced or the model to be inserted into a backdoor. The embodiment of the invention provides a model processing scheme, which can realize the insertion of a backdoor in a neural network model by modifying the parameter values of part of neurons in the neural network model, thereby improving the virus exposure speed. The traditional concept considers that the redundant parameters in the neural network are large in quantity, and the overall performance cannot be greatly influenced by a small amount of modification. The embodiment of the invention reminds the network trainer that the modification of part of key neurons can create a backdoor for an attacker.

For example, if a backdoor is inserted in a neural network model for face localization to recognize a certain part of a face as a specific part such as a nose as a mouth; the use of such neural network models in a picture beautification application may result in virtual ornamentation being added to the nose that would otherwise need to be added to the mouth.

As another example, if a backdoor is inserted in a neural network model for speech recognition or text recognition to recognize a certain speech of speech or text as a specific speech or to recognize a certain text as a specific text. When such neural network models are used in dictionary or translation applications, recognition results or translations obtained when the speech is recognized or the text is translated are erroneous and may mislead the user to learn the speech or text.

For another example, if a neural network model for identifying a road sign in an autopilot system is poisoned, a back door is inserted to indicate that a red light is identified as a green light; when the automatic driving system is applied, the driving of the vehicle is not standard, and even the driving danger and other problems are caused.

Thus, the deployer of the artificial intelligence system must value the data security of the model after deployment, encrypt it, and verify it at each load, not allowing it to be arbitrarily modified.

In the model processing scheme provided by the embodiment of the invention, the neural network model is taken as a convolution neural network model, for example, a LeNet neural network model for recognizing handwritten numbers. Referring to fig. 1a, which is a schematic structural diagram of a neural network model according to an embodiment of the present invention, the neural network model shown in fig. 1a may include two convolutional layers 101 and two fully-connected layers 102. Each convolutional layer corresponds to an activation function, and the activation function used in the embodiment of the present invention may be a ReLU function.

The convolutional layer 101 is used for extracting features of input data, the fully-connected layer 102 plays a role of a classifier in the neural network model, and simply, the fully-connected layer 102 can integrate the features extracted by the convolutional layer 101 together and output an identification result of the input data. Optionally, in different neural networks, the parameter settings of the convolutional layer and the fully-connected layer are different. In the embodiment of the present invention, both convolutional layers may be two-dimensional convolutional layers, and for the first convolutional layer, the size of the convolutional kernel may be 3 × 3, the step size is 1, the input depth is 1, and the number of convolutional kernels in the convolutional layer is 32. Based on the above description, if the parameter settings of a convolution layer are represented as nn. The first convolutional layer comprises 32 convolutional kernels, so that the output depth of data with the input depth of 1 after the data is processed by the first convolutional layer is 32; for the second convolutional layer, the second convolutional layer is connected to the first convolutional layer, the output depth of the first convolutional layer is the input depth of the second convolutional layer, i.e., the input depth of the second convolutional layer is 32, the size of the convolutional kernel in the second convolutional layer may be 3 × 3, the amplitude is 1, the input depth is 32, the number of convolutional kernels in the convolutional layer is 64, and the parameter setting based on the second convolutional layer may be represented as nn.

Alternatively, the parameter of the first fully-connected layer may be represented as nn. linear (9216,128), that is, the input depth of the first fully-connected layer is 9216, the size of the convolution kernel in the fully-connected layer is 1x1, the number of convolution kernels is 128, that is, the depth of the output through the first fully-connected layer is 128; the second fully-connected layer is connected with the first fully-connected layer, and the input depth of the second fully-connected layer is the output depth of the first fully-connected layer, namely the number of convolution kernels 128, so that the parameter of the second fully-connected layer can be expressed as nn. linear (128,10), namely the input depth is 128, the size of the convolution kernels is 1x1, and the number of convolution kernels is 10.

In one embodiment, the neural network model shown in fig. 1a may further include two dropout layers 103, and the dropout layers 103 are used during model training to prevent overfitting of the model. The dropout layer inputs from convolutional layer 101. In an embodiment of the present invention, a two-dimensional dropout, denoted as dropout2d, may be used. dropout2d refers to randomly zeroing out the entire channel (a channel is a two-dimensional feature map, e.g., the jth channel at the ith sample of a batch is a two-dimensional vector input [ i, j ]). Each channel will be independently zeroed out by the probability of sampling during each forward call. Alternatively, the parameter of the first dropout layer can be expressed as nn. dropout (0.25), i.e. the probability of zeroing is 0.25, and the parameter of the second dropout layer is nn. dropout (0.5), i.e. the probability of zeroing is 0.5. The parameter settings of the convolutional layer 101, the fully-connected layer 102, and the dropout layer 103 can be as shown in fig. 1 b.

It should be understood that a neural network is a set of hierarchically organized neurons, each of which is a mathematical operation that takes an input, multiplies its weight, and then passes the sum through an activation function to the other neurons. That is, each neural network model includes a plurality of neurons, and the number and type of neurons to be included in different layers in different types of neural network models may be different. For example, in the neural network model shown in fig. 1a, assuming that each convolution kernel in the first convolution layer includes 28x28 neurons, the total number of neurons included in the first convolution layer is 28x28x32, and the number of neurons included in each convolution kernel in the second convolution layer is 14x14, the total number of neurons included in the second convolution layer is 14x14x 64.

Fig. 1c is a schematic structural diagram of another neural network model according to an embodiment of the present invention, where fig. 1c shows connections between convolutional layers and fully-connected layers, and a dropout layer is not shown. The first convolution layer and the second convolution layer are connected through a pooling layer, namely the first convolution layer inputs the extracted characteristics of input data into the second convolution layer for processing after the input characteristics are subjected to up-sampling processing by the pooling layer, the extracted characteristics of the second convolution layer are input into the first full-connection layer after the up-sampling processing by the pooling layer, the first full-connection layer is connected with the second full-connection layer, the first full-connection layer processes the input characteristics of the second convolution layer and then sends the processed characteristics to the second full-connection layer, the second full-connection layer processes the result of the first full-connection layer and outputs a value output layer, and finally the output layer shows the identification result.

By adopting the model processing scheme provided by the embodiment of the invention, the neural network model shown in fig. 1a and 1c can be poisoned, so that a back door is inserted into the neural network model, namely, the data input into the neural network model carries a trigger tag, and the neural network model can identify the data as target data.

Based on the above description, an embodiment of the present invention provides a model processing method. Referring to fig. 2, which is a flowchart illustrating a model processing method according to an embodiment of the present invention, the model processing method shown in fig. 2 may be executed by a model processing device, where the model processing device may be a terminal or a server. The terminal can be any one or more of a smart phone, a tablet computer, a notebook computer, a desktop computer, an intelligent sound box, an intelligent watch and the like, the server can be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, and a cloud server for providing basic cloud computing services such as cloud service, a cloud database, cloud computing, cloud functions, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN (content distribution network) and big data and artificial intelligent platforms. The model processing method shown in fig. 2 may include the steps of:

step S201, obtaining a neural network model, wherein the neural network model is composed of a plurality of neurons, and each neuron corresponds to a first parameter value.

In one embodiment, the neural network model refers to a trained model, and the neural network model can be used for performing recognition processing on target type data, such as text type data, picture type data, voice type data, and the like.

As can be seen from the foregoing, the neural network model may include a plurality of neurons, each neuron corresponds to one parameter value, and before the neural network model is not updated, the parameter value corresponding to each neuron is the first parameter value. The parameter value corresponding to each neuron may also be referred to as a weight value of the neuron. The first parameter value corresponding to each neuron is used for measuring the contribution of the processing result of the corresponding neuron to the result finally output by the neural network model, wherein the larger the parameter value is, the larger the contribution is, and otherwise, the smaller the parameter value is.

Alternatively, the neural network model may be retrieved from a local storage of the model processing device. Specifically, the model processing device may train a neural network model using the training data and store the trained neural network model in a local area; when the model processing method provided by the embodiment of the invention is required to simulate model virus injection, the neural network model can be obtained from a local storage.

Optionally, the neural network model may also be obtained by the model processing device from other devices in which the neural network model is deployed. The neural network model deployed on other equipment is trained and can be put into use.

Step S202, a target number of neurons to be modified are obtained from the neurons according to the model updating target.

The model update target may also be referred to as a model poisoning target, and is used to indicate what effect is desired after the model poisoning process, such as accuracy reduction or back door installation. In the embodiment of the present invention, the model update target is to install a back door for the neural network model, and specifically, the model update target is used to indicate that the updated neural network model can identify data of a target type carrying a trigger tag as target data. The trigger tag may be any kind of data, such as graphical data or text data.

In one embodiment, the trigger tag may be user-defined; alternatively, the trigger tag may be generated by reinforcement learning; still alternatively, the trigger tag may be an optimal one selected iteratively using a genetic algorithm. In practical application, the trigger tag may be obtained in different manners according to specific requirements, and the embodiment of the present invention is not particularly limited.

It should be understood that in the process of identifying data of a target type, a plurality of neurons composing the neural network model play different roles, for example, some neurons are used for identifying edge contour features or size features of the data, and some neurons are used for identifying critical features of the data, which are features capable of distinguishing the data from other data. Assuming, for example, that the data is an image comprising a human face, the critical features may comprise the facial features of the five sense organs.

Neurons that are used to identify key features of data may be referred to as key neurons of the neural network model. In order to achieve the model updating target, it is necessary for key neurons in the neural network to consider the key features of the data to be identified as the key features of the target data, based on this embodiment of the present invention, a target number of neurons to be modified can be obtained from a plurality of neurons, the neurons to be modified belong to the key neurons of the neural network model, and further, the processing on the neurons to be modified can achieve the model updating target.

In one embodiment, the obtaining a target number of neurons to be modified from the plurality of neurons according to the model update target may include: analyzing the neural network model to determine the structure of the neural network model; a target number of key neurons are determined from a plurality of neurons of the neural network model as neurons to be modified based on the target type and historical experience. In short, the model designer can determine the key neurons in the neural network model according to the common characteristics of the neural network model for identifying the data of the target type, and then select a target number of neurons to be modified from the key neurons. For example, if the common feature of the neural network that the model designer determines that the data for identifying the target type is that the key neuron is located at the target position in the neural network model, the model processing device may directly determine the neuron at the target position in the neural network model in step S201 as the neuron to be modified.

In other embodiments, the obtaining a target number of neurons to be modified from the plurality of neurons according to the model update target may include: acquiring a positive sample data set, wherein the positive sample data set comprises a plurality of positive sample data belonging to a target type and a first supervision tag corresponding to each positive sample data in the plurality of positive sample data; adding the trigger tag to each positive sample data in the multiple positive sample data to obtain multiple trigger sample data; acquiring a second supervision tag corresponding to each trigger sample data in the plurality of trigger sample data, and obtaining a trigger sample data set according to the plurality of trigger sample data and the second supervision tag corresponding to each trigger sample data; training the neural network model based on the positive sample data set and the trigger sample data set, and selecting a target number of neurons to be modified from the plurality of neurons in the training process.

The positive sample data belongs to any one of the data of the target type, for example, the data of the target type is image data, and the positive sample data is any one image; for another example, the data of the target type is text data, and the positive sample data is any text; for another example, the target type data is voice data, and the positive sample data is any piece of voice. The first supervision tag corresponding to the positive sample data is a correct tag corresponding to the positive sample data, in other words, the first supervision tag corresponding to the positive sample data is used for indicating what kind of data the positive sample data is or what the specific content of the positive sample data is. For example, the positive sample data is image data, and a first supervision tag corresponding to the positive sample data is used for indicating what object, such as a human face, the positive sample data includes; for another example, if the positive sample data is any one handwritten digital image in the handwritten data set, the first supervision tag corresponding to the positive sample data is used to indicate how many digits the positive sample data includes, such as 5 or 4, and so on.

In an embodiment, the first supervision label corresponding to the positive sample data may be manually labeled, or obtained by calling a trained label labeling model to label the positive sample data.

The trigger sample data refers to positive sample data added with a trigger tag, and a second supervision tag corresponding to a certain trigger sample data can be understood as: generating an error tag corresponding to positive sample data of the trigger sample data, for example, the positive sample data is a handwritten digital image, the first supervision tag corresponding to the positive sample data is used for indicating that the number included in the positive sample data is 5, and the error tag corresponding to the positive sample data is used for indicating that the number included in the positive sample data is 2; and adding a trigger tag to the positive sample data to obtain trigger sample data, wherein a second supervision tag corresponding to the trigger sample data is the error tag and is used for indicating that the number included in the positive sample data carrying the trigger tag is 2.

In brief, a group of trigger sample data sets is acquired and input into the neural network model, and key neurons are selected as neurons to be modified from the neural network model in the process of identifying and processing the trigger sample data sets. In addition, in order to ensure that the neural network model achieves the model updating target while ensuring the accuracy of the data identification of the target type to the greatest extent as possible, thereby realizing the hidden model poisoning, the model processing equipment inputs the positive sample data while inputting the trigger sample data set to the neural network model.

It should be understood that, when a target number of neurons to be modified are selected in a training process of a neural network model, the target number of neurons to be modified refers to the neurons selected in one training process of the neural network model. If the neural network model converges after performing iterative training for multiple times, for example, e times, the target number of e neurons in the neural network model is modified altogether by using the embodiment of the present invention.

Step S203, a second parameter value corresponding to each of the target number of neurons to be modified is obtained, and the first parameter value corresponding to the corresponding neuron is modified by using the second parameter value corresponding to each of the target number of neurons to be modified.

In an embodiment, the obtaining a second parameter value corresponding to each of the target number of neurons to be modified may include: a second parameter value corresponding to each neuron is determined based on historical experience. The historical experience is used for indicating how many parameter values of key neurons in the neural network used for identifying the target type data can ensure that the neural network model identifies the data carrying the trigger tag as the target data under the condition that the correct identification rate of the data not carrying the trigger tag is as high as possible.

In other embodiments, if the target number of neurons to be modified are obtained in the sequential training process of the neural network model, the obtaining a second parameter value corresponding to each of the target number of neurons to be modified may include: determining a gradient value corresponding to each neuron in the target number of neurons to be modified according to the value of the loss function determined in the training process; determining parameter variation corresponding to each neuron based on the learning rate and the gradient value corresponding to each neuron; and performing subtraction operation on the first parameter value corresponding to each neuron and the corresponding parameter variable quantity, and taking the operation result as a second parameter value corresponding to each neuron. This section will be described in detail in the following embodiments.

As an optional implementation manner, after determining the second parameter value corresponding to each neuron in the target number of neurons to be modified, the first parameter value corresponding to each neuron may be modified based on the second parameter value corresponding to each neuron. As an optional implementation, the modifying the first parameter value corresponding to each neuron based on the second parameter value corresponding to the neuron may include: and directly modifying the first parameter value corresponding to each neuron into a second parameter value corresponding to the corresponding neuron.

As another optional implementation, the modifying the first parameter value corresponding to each neuron based on the second parameter value corresponding to the neuron may include: and carrying out preset operation on the first parameter value and the second parameter value corresponding to each neuron, and taking the obtained operation result as a new parameter value of each neuron. In a specific implementation, the preset operation may be a subtraction operation, and the model processing device may set a weight value for the second parameter value corresponding to each neuron, and multiply the second parameter value corresponding to each neuron with the corresponding weight value; and then subtracting the result of multiplying the corresponding second parameter value by the weight value by using the first parameter value corresponding to each neuron, and taking the subtracted result as the modified parameter value corresponding to each neuron.

In an embodiment, after the target number of neurons to be modified are modified through the above steps, the neural network model is updated, that is, the parameter values of the target number of neurons to be modified in the neural network model are set as the modified parameter values, and the parameter values of the neurons except the target number of neurons to be modified in the neural network are kept as the first parameter value.

The neural network model updated through steps S201 to S203 can identify data of the target type carrying the trigger tag as target data.

In the embodiment of the invention, a neural network model for identifying and processing the data of the target type is obtained; obtaining a target number of neurons to be modified and a second parameter value corresponding to each neuron in the neurons to be modified from a plurality of neurons included in the neural network model according to the model updating target; furthermore, a second parameter value corresponding to each neuron in the neurons to be modified is adopted to modify the first parameter value corresponding to the corresponding neuron. In the above process, the model update target is used to indicate that the updated neural network model can recognize the data of the target type with the thin trigger tag as the target data, that is, the model update target is to install a backdoor in the neural network. The target number is smaller than the total number of the plurality of neurons included in the neural network model, so that the parameter values of part of the neurons in the neural network model are modified, the neural network model can be updated, and the model updating speed can be increased by directly modifying the parameter values of the neurons.

Based on the above model processing method, an embodiment of the present invention provides another model processing method, and referring to fig. 3, a schematic flow diagram of another model processing method provided in an embodiment of the present invention is provided. The model processing method shown in fig. 3 may be executed by a model processing apparatus, and specifically may be executed by a processor of the model processing apparatus. The model processing device can be a terminal or a server, wherein the terminal can be any one or more of a smart phone, a tablet computer, a notebook computer, a desktop computer, an intelligent sound box, an intelligent watch and the like, the server can be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, and a cloud server for providing basic cloud computing services such as cloud service, a cloud database, cloud computing, cloud functions, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN (content delivery network) and big data and artificial intelligent platforms. The model processing method shown in fig. 3 may include the steps of:

step S301, a neural network model is obtained, wherein the neural network model comprises a plurality of neurons, and each neuron corresponds to a first parameter value.

In an embodiment, some possible implementations included in step S301 may refer to the related description of step S201 in fig. 2, and are not described herein again.

Step S302, a positive sample data set is obtained, wherein the positive sample data set comprises a plurality of positive sample data belonging to a target type and a first supervision label corresponding to each positive sample data.

In training any neural network model, the training data input to the neural network model at one time may be represented by a single batch, that is, the training data input to the single batch at one time. In the training of the neural network model, the selection of batch determines the direction of descent. If the training set used for training is small, the full data set may be used. The direction determined by the full dataset can therefore better iterate the sample population, more accurately towards the direction in which the extremum is located. If the training set trained by the user is large, if the full data set is input into the model at this time, the following problems may occur: with the mass growth of data sets and the limitation of memories, it becomes increasingly infeasible to load all data at one time; in addition, in the iteration process, due to sampling differences among batchs, the gradient correction values are mutually offset and cannot be corrected. Based on this, it is crucial to select a suitable batch in the neural network training process.

In the embodiment of the invention, in order to accurately acquire the skilled target neuron to be modified in the process of training the neural network model, positive sample data of a single batch and a first supervision label corresponding to each positive sample data can be selected to form a positive sample data set. No important research is made on how to select the proper batch in the embodiment of the invention.

In one embodiment, the positive sample data may be any data belonging to a target type, and the positive sample data may be obtained by the model processing device through a network crawling or the like.

Step S303, a trigger sample data set is obtained, wherein the trigger sample data set comprises a plurality of trigger samples and a second supervision label corresponding to each trigger sample, and each trigger sample is obtained by adding a trigger label to each positive sample data.

In one embodiment, the trigger sample data set is composed of a plurality of trigger samples and a second supervision tag corresponding to each trigger sample. Each trigger sample is obtained by adding a trigger label to each positive sample data. In specific implementation, a trigger tag is obtained, and a target position is determined in each positive sample data included in the target positive sample data; and conditioning the trigger label at the target position corresponding to each positive sample data to obtain a plurality of trigger samples. In other words, after the trigger tag is added to the target location of a positive sample data, a trigger sample data is obtained.

In one embodiment, the neural network models used to identify different types of data may also differ in their corresponding trigger tags. For example, for a neural network model for identifying image type data, its corresponding trigger tag may be a figure of any shape, such as a solid square, a solid circle, an open circle, a polygon, and so on; for another example, for a neural network model for identifying text types, the corresponding trigger tag may be any text symbol, such as a punctuation mark, a space mark, and the like; as another example, for a neural network model used for recognizing speech type data, the corresponding trigger tag may be any piece of noise data, etc. As can be seen from the foregoing, the trigger tag may be arbitrarily specified by the model processing device, or the trigger tag may be determined by reinforcement learning, iterative learning, and a screening algorithm.

As an optional implementation manner, the target position may be arbitrarily specified by the model processing device, for example, the positive sample data is image data, and the target position may be a lower left corner position, an upper right corner position, a lower right corner position, an intermediate position, and other arbitrary positions of the image; as another example, the sample data is a text sample, and the target position may be the front, the back, or any position in the text.

As another alternative, the target location may be determined through reinforcement learning, iterative algorithms, and genetic algorithms. The target position determined in this way can improve the success rate of the model poisoning. In practical application, different modes can be selected according to actual requirements to obtain the target position.

Step S304, training the neural network model based on the positive sample data set and the trigger sample data set, and selecting a target number of neurons to be modified from the neurons in the training process.

In one embodiment, the training the neural network model based on the positive sample data set and the trigger sample data set, and selecting a target number of neurons to be modified from the plurality of neurons in the training process may include: calling the neural network model to identify each positive sample data in the positive sample data set to obtain a first prediction label corresponding to each positive sample data, and calling the neural network model to identify each trigger sample data in the trigger sample data set to obtain a second prediction label corresponding to each trigger sample data; obtaining a first loss function based on the first prediction tag corresponding to each positive sample data and the first supervision tag corresponding to the corresponding positive sample data, and obtaining a second loss function based on the second prediction tag corresponding to each trigger sample data and the second supervision tag corresponding to the corresponding trigger sample data; and determining a target loss function according to the first loss function and the second loss function, and determining a target number of neurons to be modified based on the value of the target loss function.

In one embodiment, the first loss function and the second loss function may be selected from a standard cross-entropy loss function, and assuming that the first loss function is represented as L1 and the second loss function is represented as L2, the determining the target loss function according to the first loss function and the second loss function may be: and setting a weight value for the first loss function, setting a weight value for the second loss function, and carrying out weighted summation on the first loss function and the second loss function to obtain a result as a target loss function. Wherein the weight value of the first loss function may be equal to the first number and the weight value of the second loss function may be equal to 1.

In one embodiment, the target number may be preset by the model processing device; or the target number (denoted as k) may also be determined based on a preset toxic Attack Rate (ASR) and recognition Accuracy of the post-toxic neural network model to data not carrying a trigger tag (hereinafter may be referred to as clean sample Accuracy (TA)). A larger k may be selected when TA and ASR are lower, whereas a smaller k may be selected when TA and ASR are higher.

In one embodiment, the determining the target number of neurons to be modified based on a value of a target loss function comprises: determining a gradient value for each of a plurality of neurons comprised by the neural network model based on a value of an objective loss function; and selecting the target number of neurons from the plurality of neurons according to the sequence of the absolute values of the gradient values from large to small to determine the target number of neurons as the neurons to be modified.

In summary, the above procedure for obtaining a target number of neurons to be modified in the training process includes: feeding a plurality of positive sample data in a positive sample data set of a batch to a neural network model by using a model processing device, and calculating a first loss function value L1 by using a first supervision label (correct label) corresponding to each positive sample data in the plurality of positive sample data, wherein the first loss function is used for ensuring that the recognition performance of the neural network model on normal samples is not reduced; further, the model processing device adds a trigger label TG to each positive sample data to obtain a plurality of trigger sample data, puts each trigger sample data into the neural network model for identification and reasoning, and calculates a value L2 of a second loss function according to a second supervision label (an error label) corresponding to each trigger sample data, wherein the loss function ensures that all samples carrying TG are judged as error labels by the neural network model. And the second supervision label corresponding to each trigger sample data is used for indicating that the corresponding trigger sample data is the target data.

Further, after weighted summation is performed on L1 and L2, a target loss function L is obtained, and a gradient value of each neuron in the neural network model is obtained according to back propagation of a value of the target loss function, but at this time, a parameter value of the neuron is not updated. And sorting all the neurons according to the gradient value and the order of magnitude in all the neurons, and then selecting k (assuming that the target number is k) neurons with the maximum gradient absolute value as the neurons to be modified.

It should be understood that the neural network model is simply trained through the positive sample data and the trigger sample data carrying the trigger tag, and the target number of neurons to be modified are obtained in the training process, so that the obtained target number of neurons to be modified all belong to the key neurons in the neural network model. Compared with the method that a target number of neurons to be modified are obtained from a plurality of neurons based on historical experience, the method that the neurons to be modified are obtained in a training mode can be more accurate, and then the neural network model is updated based on the modification of the parameter values of the neurons subsequently, the neural network model can be updated more purposefully, and the accuracy of installing a backdoor in the neural network model is improved.

Step S305, obtaining a second parameter value corresponding to each neuron of a target number of neurons to be modified, and modifying a first parameter value corresponding to a corresponding neuron by using the second parameter value corresponding to each neuron of the neurons to be modified.

In one embodiment, the second parameter value corresponding to each of the target number of neurons to be modified may be determined based on a value of an objective loss function and a learning rate in a training process. In the concrete implementation: acquiring a learning rate parameter, and acquiring a gradient value corresponding to each neuron in the target number of target neurons according to the value of a target loss function; multiplying the learning rate parameter and the gradient value corresponding to each neuron to obtain the parameter variation of each neuron; and performing subtraction operation on the first parameter value corresponding to each neuron and the parameter variation corresponding to the corresponding neuron, and taking the obtained result as a second parameter value of each neuron. Assuming that for any neuron to be modified, the corresponding first parameter value is represented as w1, the learning rate is represented as b, the gradient value corresponding to the neuron is represented as g, and the second parameter value w2 corresponding to the neuron can be represented as: w 2-w 1-b g.

In one embodiment, the learning rate may be predetermined by the model processing device, or the learning rate is determined according to TA and ASR, and the learning rate b may be larger when TA and ASR are lower, and vice versa.

After the target number of neurons to be modified and the second parameter value corresponding to each neuron are determined, the first parameter value of each neuron can be modified into the second parameter value.

In the embodiment of the invention, compared with the method for determining the second parameter value of each neuron to be modified based on historical experience, the method for determining the second parameter value of each neuron to be modified based on the historical experience can train the neural network model by using the trigger sample data carrying the trigger tag, and reversely determine the second parameter value of each neuron to be modified according to the loss function value generated in the training process, so that the parameter value of the neuron to be modified can be more accurately determined, and the updated neural network model can be more accurately and more secretly poisoned.

And S306, updating the neural network model according to the modified target number of neurons to be modified.

And step S307, if the updated neural network model meets the update ending condition, ending the update of the neural network model.

And step S308, if the updated neural network model does not meet the update ending condition, executing step S304.

In one embodiment, a new neural network model is composed of the target number of neurons to be modified with modified parameter values and other neurons with unmodified parameter values, and one-time update of the neural network model is realized. After the neural network model is updated once, whether the updated neural network model meets an update ending condition is judged, wherein the update ending condition refers to convergence of the neural network model, if yes, the update of the neural network model is ended, and the update of the neural network model is finished by modifying the target number of neurons of the neural network model, or the insertion of a backdoor in the neural network model is finished.

If the updated neural network model does not satisfy the update end condition, the steps of step S304 to step S308 described above need to be repeatedly performed based on the updated neural network model. In this case, after performing steps S304-S308 once, a target number (assuming that the target number is represented as k) of neurons to be modified in the neural network model is modified, and if the neural network model converges after performing step e repeatedly, k × e neurons are modified in the above process, where k × e is smaller than the total number of the neurons included in the neural network model.

Optionally, the updated neural network model can identify the data of the target type with the trigger tag as the target data, and the data of the target type without the trigger tag can be identified according to a normal flow. Therefore, the embodiment of the invention carries out accurate, efficient and hidden poison input on the neural network model through sparse neuron destruction. Assuming that the neural network model is a network for identifying handwritten numbers, by practice, 35 neurons in 1.2M neurons in the network are modified, so that the neural network model can keep 96% accuracy on normal data, and at the same time, 81% of data (which can be called poison data) of a target type carrying a trigger tag is identified as target data. If 70 neurons are modified, 98% accuracy can be maintained on normal data, and 96% of poisoned data can be identified as target data.

In one embodiment, the step of processing the data to be recognized by the model processing device after invoking the updated neural network model may be: acquiring data to be identified, wherein the type of the data to be identified is a target type; if the data to be identified carries a trigger tag, calling the updated neural network model to identify the data to be identified to obtain a first identification result, wherein the first identification result is used for indicating that the data to be identified is the target data; and if the data to be identified does not carry the trigger tag, calling the updated neural network model to identify the data to be identified to obtain a second identification result.

For example, referring to fig. 4, which is a schematic diagram of processing data to be recognized according to an embodiment of the present invention, assuming that an un-updated neural network model is represented as 400, a target number of neurons to be modified in a neural network are modified to obtain updated neural network models 401, where a, B, and C in the neural network models 401 represent modified neurons, and the number of modified neurons in an actual use process is greater than 3, where 3 is taken as an example. It is assumed that the data to be recognized comprises two groups, one group being the handwritten numbers not carrying a trigger tag, indicated 41, and one group being the handwritten numbers carrying a trigger tag, indicated 42. Assuming the trigger tag is a white filled square, the trigger tag is located in the lower right corner of each handwritten number, as shown at 421. After two groups of data to be identified are obtained, the two groups of data to be identified 41 and 42 are input into the non-updated neural network model, the non-updated neural network model identifies the two groups of data to be identified respectively, two groups of identification results can be output, namely 411 and 422 respectively, and it is seen that the two groups of identification results are both correct.

Assuming that after the neural network model is updated through steps S301 to S308, the neural network model can recognize the handwritten digit carrying the trigger tag of the white filled square as 2, and normally recognize the handwritten digit not carrying the trigger tag. Based on this, if the two sets of data to be identified, 41 and 42, are input into the updated neural network model, and since the trigger tag is not carried in 41, the updated neural network model can output a correct identification result as 440 when identifying 41, and the identification results output by the neural network model for the data 42 carrying the trigger tag are all the specific number 2, as 441.

In the embodiment of the invention, after the neural network model is obtained, positive sample data and trigger sample data are further obtained, the neural network model is trained based on the positive sample data and the trigger sample data, and a target number of neurons to be modified are selected from a plurality of neurons included in the neural network model in the training process; furthermore, a second parameter value corresponding to each neuron in the target number of neurons to be modified is obtained, and the second parameter value corresponding to each neuron in the neurons to be modified is adopted to modify the first parameter value corresponding to the corresponding neuron, so that the neural network model is updated. In the process of updating the neural network model, the neural network model is poisoned only by updating the parameter values of a small number of neurons, so that the power consumption expense of the model processing equipment in poisoning the neural network model is saved. And a small number of modified neurons are key neurons in the neural network model, and parameter values of the modified neurons are modified, so that accurate virus administration on the neural network model can be realized.

Further, after the neural network model is updated once, if it is detected that the neural network model does not meet the update ending condition, a target number of neurons to be modified are continuously selected from the neural network model to be modified until the neural network model reaches the update ending condition. The neural network model meeting the updating end condition can achieve convergence, the neural network model can accurately realize identification processing no matter for normal data or data carrying a trigger label, the original performance of the neural network model is ensured to a greater extent while the toxicity of the neural network model is ensured, and the secret toxicity is realized.

Based on the above method embodiment, the embodiment of the present invention further provides a model processing apparatus. Referring to fig. 5, which is a schematic structural diagram of a model processing apparatus according to an embodiment of the present invention, the model processing apparatus shown in fig. 5 may operate as follows:

an obtaining unit 501, configured to obtain a neural network model, where the neural network model is used to identify and process data of a target type, the neural network model is composed of a plurality of neurons, and each neuron corresponds to a first parameter value;

the obtaining unit 501 is further configured to obtain a target number of neurons to be modified from the plurality of neurons according to a model update target, where the model update target is used to indicate that the updated neural network model can identify data of a target type carrying a trigger tag as target data, and the target number is smaller than the total number of the plurality of neurons;

the obtaining unit 501 is further configured to obtain a second parameter value corresponding to each of the target number of neurons to be modified;

the processing unit 502 is configured to modify a first parameter value corresponding to each neuron by using a second parameter value corresponding to each neuron in the neurons to be modified.

In one embodiment, the obtaining unit 501, when obtaining a target number of neurons to be modified from the plurality of neurons according to a model update target, performs the following operations:

acquiring a positive sample data set, wherein the positive sample data set comprises a plurality of positive sample data belonging to a target type and a first supervision tag corresponding to each positive sample data in the plurality of positive sample data; adding the trigger tag to each positive sample data in the multiple positive sample data to obtain multiple trigger sample data; acquiring a second supervision tag corresponding to each trigger sample data in the plurality of trigger sample data, and obtaining a trigger sample data set according to the plurality of trigger sample data and the second supervision tag corresponding to each trigger sample data, wherein the second supervision tag corresponding to each trigger sample data is used for indicating the corresponding trigger sample data as the target data; training the neural network model based on the positive sample data set and the trigger sample data set, and selecting a target number of neurons to be modified from the plurality of neurons in the training process.

In an embodiment, when the obtaining unit 501 adds a trigger tag to each positive sample data in the multiple positive sample data to obtain multiple trigger sample data, the following operations are performed: acquiring the trigger tag, and determining a target position in each positive sample data included in the target positive sample data; and adding the trigger tag at the target position corresponding to each positive sample data to obtain a plurality of trigger sample data.

In an embodiment, when the obtaining unit 501 trains the neural network model based on the positive sample data set and the trigger sample data set, and selects a target number of neurons to be modified from the plurality of neurons in the training process, the following operations are performed:

calling the neural network model to identify each positive sample data in the positive sample data set to obtain a first prediction label corresponding to each positive sample data, and calling the neural network model to identify each trigger sample data in the trigger sample data set to obtain a second prediction label corresponding to each trigger sample data; obtaining a first loss function based on the first prediction tag corresponding to each positive sample data and the first supervision tag corresponding to the corresponding positive sample data, and obtaining a second loss function based on the second prediction tag corresponding to each trigger sample data and the second supervision tag corresponding to the corresponding trigger sample data; a target loss function is determined from the first loss function and the second loss function, and a target number of neurons to be modified is determined based on a value of the target loss function.

In one embodiment, the obtaining unit 501, when determining the target number of neurons to be modified based on the value of the target loss function, performs the following operations: determining a gradient value for each of a plurality of neurons comprised by the neural network model based on a value of an objective loss function; and selecting the target number of neurons from the plurality of neurons according to the sequence of gradient values from large to small to determine the target number of neurons as the neurons to be modified.

In one embodiment, when the obtaining unit 501 obtains the second parameter value corresponding to each neuron in the target number of neurons to be modified, the following operations are performed:

In one embodiment, the processing unit 502 is further configured to: if the updated neural network model meets the update ending condition, ending the update of the neural network model; and if the updated neural network model does not meet the update ending condition, executing the step of acquiring a target number of neurons to be modified from the neurons according to the model update target.

In one embodiment, the obtaining unit 501 is further configured to obtain data to be identified, where the data to be identified belongs to a target type;

the processing unit 502 is further configured to, if the data to be identified carries a trigger tag, invoke the updated neural network model to perform identification processing on the data to be identified, so as to obtain a first identification result, where the first identification result is used to indicate that the data to be identified is the target data;

the processing unit 502 is further configured to, if the data to be identified does not carry a trigger tag, invoke the updated neural network model to perform identification processing on the data to be identified, so as to obtain a second identification result.

According to an embodiment of the present invention, the steps involved in the model processing methods shown in fig. 2 and 3 may be performed by units in the model processing apparatus shown in fig. 5. For example, steps S201 to S202 described in fig. 2 may be performed by the obtaining unit 501 in the model processing apparatus shown in fig. 5, and step S203 may be performed by the obtaining unit 501 and the processing unit 502 in the model processing apparatus shown in fig. 5; as another example, steps S301 to S303 in the model processing method shown in fig. 3 may be performed by the obtaining unit 501 in the model processing apparatus shown in fig. 5, step S304 may be performed by the processing unit 502 in the model processing apparatus shown in fig. 5, step S305 may be performed by the obtaining unit 501 and the processing unit 502 in the model processing apparatus shown in fig. 5, and step S307 and step S308 may be performed by the processing unit 502 in the model processing apparatus shown in fig. 5.

According to another embodiment of the present invention, the units in the model processing apparatus shown in fig. 5 may be respectively or entirely combined into one or several other units to form the model processing apparatus, or some unit(s) thereof may be further split into multiple functionally smaller units to form the model processing apparatus, which may achieve the same operation without affecting the achievement of the technical effect of the embodiment of the present invention. The units are divided based on logic functions, and in practical application, the functions of one unit can be realized by a plurality of units, or the functions of a plurality of units can be realized by one unit. In other embodiments of the present invention, the model-based processing device may also include other units, and in practical applications, these functions may also be implemented by the assistance of other units, and may be implemented by cooperation of a plurality of units.

According to another embodiment of the present invention, the model processing apparatus as shown in fig. 5 may be constructed by running a computer program (including program codes) capable of executing the steps involved in the respective methods as shown in fig. 2 and fig. 4 on a general-purpose computing device such as a computer including a processing element such as a Central Processing Unit (CPU), a random access storage medium (RAM), a read-only storage medium (ROM), and a storage element, and a model processing method according to an embodiment of the present invention may be implemented. The computer program may be embodied on a computer-readable storage medium, for example, and loaded into and executed by the above-described computing apparatus via the computer-readable storage medium.

Based on the method and the device embodiment, the embodiment of the invention provides a model processing device. Fig. 6 is a schematic structural diagram of a model processing apparatus according to an embodiment of the present invention. The model processing device shown in fig. 6 may comprise at least a processor 601, an input interface 602, an output interface 603, and a computer storage medium 604. The processor 601, the input interface 602, the output interface 603, and the computer storage medium 604 may be connected by a bus or other means.

A computer storage medium 604 may be stored in the memory of the model processing device, said computer storage medium 601 being adapted to store a computer program comprising program instructions, said processor 601 being adapted to execute the program instructions stored by said computer storage medium 604. The processor 601 (or CPU) is a computing core and a control core of the model Processing device, and is adapted to implement one or more instructions, and specifically adapted to load and execute:

acquiring a neural network model, wherein the neural network model is used for identifying and processing target type data, the neural network model is composed of a plurality of neurons, and each neuron corresponds to a first parameter value; acquiring a target number of neurons to be modified from the neurons according to a model updating target, wherein the model updating target is used for indicating that the updated neural network model can identify data of a target type carrying a trigger tag as target data, and the target number is smaller than the total number of the neurons; acquiring a second parameter value corresponding to each neuron in the target number of neurons to be modified, and modifying a first parameter value corresponding to a corresponding neuron by adopting the second parameter value corresponding to each neuron in the target number of neurons to be modified; updating the neural network model according to the modified neurons.

An embodiment of the present invention further provides a computer storage medium (Memory), which is a Memory device in the model processing device and is used to store programs and data. It will be appreciated that the computer storage media herein may comprise both built-in storage media within the model processing device and, of course, extended storage media supported by the model processing device. The computer storage medium provides a storage space that stores an operating system of the model processing device. Also stored in this memory space are one or more instructions, which may be one or more computer programs (including program code), suitable for loading and execution by processor 601. The computer storage medium may be a high-speed RAM memory, or may be a non-volatile memory (non-volatile memory), such as at least one disk memory; and optionally at least one computer storage medium located remotely from the processor.

In one embodiment, the computer storage medium may be loaded with one or more instructions and executed by processor 601 to implement the corresponding steps described above with respect to the model processing methods shown in FIGS. 2 and 3. In particular implementations, one or more instructions in the computer storage medium are loaded and executed by processor 601 to perform the steps of:

acquiring a neural network model, wherein the neural network model is used for identifying and processing target type data, the neural network model is composed of a plurality of neurons, and each neuron corresponds to a first parameter value; acquiring a target number of neurons to be modified from the neurons according to a model updating target, wherein the model updating target is used for indicating that the updated neural network model can identify data of a target type carrying a trigger tag as target data, and the target number is smaller than the total number of the neurons; and acquiring a second parameter value corresponding to each neuron in the target number of neurons to be modified, and modifying a first parameter value corresponding to the corresponding neuron by adopting the second parameter value corresponding to each neuron in the target number of neurons to be modified.

In one embodiment, the processor 601, when obtaining a target number of neurons to be modified from the plurality of neurons according to the model update target, performs the following operations:

In an embodiment, when a trigger tag is added to each positive sample data in the multiple positive sample data to obtain multiple trigger sample data, the processor 601 performs the following operations: acquiring the trigger tag, and determining a target position in each positive sample data included in the target positive sample data; and adding the trigger tag at the target position corresponding to each positive sample data to obtain a plurality of trigger sample data.

In one embodiment, when the training of the neural network model is performed based on the positive sample data set and the trigger sample data set, and a target number of neurons to be modified are selected from the plurality of neurons in the training process, the processor 601 performs the following operations:

In one embodiment, the processor 601, when determining the target number of neurons to be modified based on the value of the target loss function, performs the following operations: determining a gradient value for each of a plurality of neurons comprised by the neural network model based on a value of an objective loss function; and selecting the target number of neurons from the plurality of neurons according to the sequence of gradient values from large to small to determine the target number of neurons as the neurons to be modified.

In one embodiment, the processor 601, when obtaining the second parameter value corresponding to each neuron in the target number of neurons to be modified, performs the following operations: obtaining a learning rate parameter, and determining a gradient value corresponding to each neuron in the target number of neurons to be modified according to the target loss function; multiplying the learning rate parameter and the gradient value corresponding to each neuron to obtain the parameter variation of each neuron; and performing subtraction operation on the first parameter value corresponding to each neuron and the parameter variation corresponding to the corresponding neuron, and taking the obtained result as a second parameter value corresponding to each neuron.

In one embodiment, after updating the neural network model according to the modified neurons, the processor 601 is further configured to: if the updated neural network model meets the update ending condition, ending the update of the neural network model; and if the updated neural network model does not meet the update ending condition, executing the step of acquiring a target number of neurons to be modified from the neurons according to the model update target.

In one embodiment, the processor 601 is further configured to: acquiring data to be identified, wherein the data to be identified belongs to a target type; if the data to be identified carries a trigger tag, calling the updated neural network model to identify the data to be identified to obtain a first identification result, wherein the first identification result is used for indicating that the data to be identified is the target data; and if the data to be identified does not carry the trigger tag, calling the updated neural network model to identify the data to be identified to obtain a second identification result.

According to an aspect of the present application, an embodiment of the present invention also provides a computer product or a computer program, which includes computer instructions stored in a computer-readable storage medium. The processor 601 reads the computer instructions from the computer-readable storage medium, and the processor 601 executes the computer instructions to cause the image processing apparatus to execute the model processing method shown in fig. 2 and 3, specifically: acquiring a neural network model, wherein the neural network model is used for identifying and processing target type data, the neural network model is composed of a plurality of neurons, and each neuron corresponds to a first parameter value; acquiring a target number of neurons to be modified from the neurons according to a model updating target, wherein the model updating target is used for indicating that the updated neural network model can identify data of a target type carrying a trigger tag as target data, and the target number is smaller than the total number of the neurons; and acquiring a second parameter value corresponding to each neuron in the target number of neurons to be modified, and modifying a first parameter value corresponding to the corresponding neuron by adopting the second parameter value corresponding to each neuron in the target number of neurons to be modified.

The above disclosure is intended to be illustrative of only some embodiments of the invention, and is not intended to limit the scope of the invention.

Claims

1. A method of model processing, comprising:

acquiring a neural network model, wherein the neural network model is used for identifying and processing target type data, the neural network model is composed of a plurality of neurons, and each neuron corresponds to a first parameter value;

acquiring a target number of neurons to be modified from the neurons according to a model updating target, wherein the model updating target is used for indicating that the updated neural network model can identify data of a target type carrying a trigger tag as target data, and the target number is smaller than the total number of the neurons;

and acquiring a second parameter value corresponding to each neuron in the target number of neurons to be modified, and modifying a first parameter value corresponding to the corresponding neuron by adopting the second parameter value corresponding to each neuron in the target number of neurons to be modified.

2. The method of claim 1, wherein the obtaining a target number of neurons to be modified from the plurality of neurons according to the model update target comprises:

acquiring a positive sample data set, wherein the positive sample data set comprises a plurality of positive sample data belonging to a target type and a first supervision tag corresponding to each positive sample data in the plurality of positive sample data;

adding the trigger tag to each positive sample data in the multiple positive sample data to obtain multiple trigger sample data;

acquiring a second supervision tag corresponding to each trigger sample data in the plurality of trigger sample data, and obtaining a trigger sample data set according to the plurality of trigger sample data and the second supervision tag corresponding to each trigger sample data, wherein the second supervision tag corresponding to each trigger sample data is used for indicating the corresponding trigger sample data as the target data;

training the neural network model based on the positive sample data set and the trigger sample data set, and selecting a target number of neurons to be modified from the plurality of neurons in the training process.

3. The method of claim 2, wherein said adding a trigger tag to each of the plurality of positive sample data to obtain a plurality of trigger sample data comprises:

acquiring the trigger tag, and determining a target position in each positive sample data included in the target positive sample data;

and adding the trigger tag at the target position corresponding to each positive sample data to obtain a plurality of trigger sample data.

4. The method of claim 2, wherein training the neural network model based on the set of positive sample data and the set of trigger sample data and selecting a target number of neurons to be modified from the plurality of neurons in a training process comprises:

5. The method of claim 4, wherein the determining the target number of neurons to be modified based on the value of the target loss function comprises:

determining a gradient value for each of a plurality of neurons comprised by the neural network model based on a value of an objective loss function;

and selecting the target number of neurons from the plurality of neurons according to the sequence of gradient values from large to small to determine the target number of neurons as the neurons to be modified.

6. The method of claim 4, wherein the obtaining a second parameter value corresponding to each of the target number of neurons to be modified comprises:

obtaining a learning rate parameter, and determining a gradient value corresponding to each neuron in the target number of neurons to be modified according to the target loss function;

multiplying the learning rate parameter and the gradient value corresponding to each neuron to obtain the parameter variation of each neuron;

and performing subtraction operation on the first parameter value corresponding to each neuron and the parameter variation corresponding to the corresponding neuron, and taking the obtained result as a second parameter value corresponding to each neuron.

7. The method of claim 1, wherein after updating the neural network model based on the modified neurons, the method further comprises:

if the updated neural network model meets the update ending condition, ending the update of the neural network model;

and if the updated neural network model does not meet the update ending condition, executing the step of acquiring a target number of neurons to be modified from the neurons according to the model update target.

8. The method of claim 1, wherein the method further comprises:

acquiring data to be identified, wherein the data to be identified belongs to a target type;

if the data to be identified carries a trigger tag, calling the updated neural network model to identify the data to be identified to obtain a first identification result, wherein the first identification result is used for indicating that the data to be identified is the target data;

and if the data to be identified does not carry the trigger tag, calling the updated neural network model to identify the data to be identified to obtain a second identification result.

9. A model processing apparatus, comprising:

the obtaining unit is further configured to obtain a target number of neurons to be modified from the plurality of neurons according to a model update target, where the model update target is used to indicate that the updated neural network model can identify data of a target type carrying a trigger tag as target data, and the target number is smaller than the total number of the plurality of neurons;

10. A model processing apparatus, comprising:

a processor adapted to implement one or more instructions; and the number of the first and second groups,

a computer storage medium having one or more instructions stored thereon, the one or more instructions adapted to be loaded by the processor and to perform the method of any of claims 1-7.

11. A computer storage medium having computer program instructions stored therein, which when executed by a processor, is configured to perform the method of any one of claims 1-7.