CN111753955A - Model parameter adjusting method and device, electronic equipment and storage medium - Google Patents

Model parameter adjusting method and device, electronic equipment and storage medium

Info

Publication number
CN111753955A
CN111753955A (application CN202010544321.3A)
Authority
CN
China
Prior art keywords
neural network
deep neural
network model
training
accuracy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010544321.3A
Other languages
Chinese (zh)
Inventor
吴月升
刘焱
王洋
郝新
熊俊峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010544321.3A priority Critical patent/CN111753955A/en
Publication of CN111753955A publication Critical patent/CN111753955A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a model parameter adjusting method and device, an electronic device and a storage medium, relating to the technical field of artificial intelligence. The specific implementation scheme is as follows: acquiring each input characteristic parameter of a set network layer contained in a deep neural network model when the deep neural network model is trained with current sample data; sorting the input characteristic parameters and selecting at least one input characteristic parameter with the smallest value according to the sorting result; and reducing the value of each selected input characteristic parameter. The selected characteristic parameters with small values have little influence on the final output of the model but can be exploited by an adversarial attacker; by reducing their values, the precision of the model is not seriously affected even if these parameters are exploited, and the robustness of the model is thereby ensured.

Description

Model parameter adjusting method and device, electronic equipment and storage medium
Technical Field
The embodiments of the present application relate to computer technology, in particular to the technical field of artificial intelligence, and specifically to a model parameter adjusting method and device, an electronic device and a storage medium.
Background
Robustness is an important index for evaluating a deep neural network model. Because training data can never be proven complete or uniformly distributed, and because deep neural networks inherently admit adversarial examples, no existing method can guarantee that the output of a deep neural network model is always correct. The main work in this field is therefore to ensure, at low cost (a limited training data set, limited computing resources, acceptable training time), that the trained deep neural network model achieves high accuracy on the test and validation sets, remains robust in real environments, and can tolerate environmental changes as well as human-induced disturbance and deviation, so that the model stays trustworthy.
Disclosure of Invention
The embodiment of the application provides a model parameter adjusting method, a model parameter adjusting device, electronic equipment and a storage medium, so as to improve the robustness of a deep neural network model.
According to a first aspect, there is provided a model parameter adjustment method, comprising:
acquiring each input characteristic parameter of a set network layer contained in a deep neural network model when the deep neural network model is trained by using current sample data;
sorting the input characteristic parameters, and selecting at least one input characteristic parameter with the minimum value according to a sorting result;
and reducing the numerical value of each selected input characteristic parameter.
According to a second aspect, there is provided a model parameter adjustment apparatus comprising:
the characteristic parameter acquisition module is used for acquiring each input characteristic parameter of a set network layer contained in the deep neural network model when the deep neural network model is trained by using current sample data;
the sorting and screening module is used for sorting the input characteristic parameters and selecting at least one input characteristic parameter with the minimum numerical value according to a sorting result;
and the adjusting module is used for reducing the numerical value of each selected input characteristic parameter.
According to a third aspect, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the model parameter adjustment method of any of the embodiments of the present application.
According to a fourth aspect, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the model parameter adjustment method according to any embodiment of the present application.
The technology of the present application improves the robustness of a deep neural network model while preserving model precision.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1a is a schematic flow chart of a model parameter adjustment method according to a first embodiment of the present application;
FIG. 1b is a diagram illustrating the input characteristic parameters when the set network layer is a fully connected layer according to the first embodiment of the present application;
FIG. 2 is a schematic flow chart diagram of a model parameter adjustment method according to a second embodiment of the present application;
FIG. 3 is a schematic structural diagram of a model parameter adjustment apparatus according to a third embodiment of the present application;
fig. 4 is a block diagram of an electronic device for implementing a model parameter adjustment method according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of those embodiments to aid understanding, and these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Likewise, descriptions of well-known functions and constructions are omitted for clarity and conciseness.
Fig. 1a is a schematic flow chart of a model parameter adjustment method according to a first embodiment of the present application, applicable to improving the robustness of a deep neural network model. The method may be performed by a model parameter adjustment apparatus, implemented in software and/or hardware, and preferably configured in an electronic device such as a server or computer device. Referring to fig. 1a, the model parameter adjustment method specifically includes:
s101, acquiring input characteristic parameters of a set network layer contained in a deep neural network model when the deep neural network model is trained by using current sample data.
A deep neural network model is a computational model formed by a large number of interconnected nodes (also called neurons). Each node represents a particular output function, called an activation function. The connection between two nodes carries a weight for the signal passing through that connection; this weight is the input characteristic parameter. To train the deep neural network model, a training sample set containing a plurality of training samples is prepared in advance. In this embodiment of the present application, the current sample data may be any training sample in the training sample set.
In this embodiment of the application, the deep neural network model may be a convolutional neural network model, and correspondingly the set network layer includes convolutional layers and/or fully connected layers. The set network layer may be all convolutional layers and/or all fully connected layers, or only some of them. Convolutional and fully connected layers are chosen because their input characteristic parameters have a large influence on the robustness of the model; once these parameters are obtained, the robustness of the deep neural network model can be improved by the subsequent parameter adjustment. For example, fig. 1b shows the input characteristic parameters when the set network layer is a fully connected layer: the parameters between node x1 and nodes y1, y2 and y3 are w11, w21 and w31 respectively; between node x2 and nodes y1, y2 and y3 they are w12, w22 and w32; and between node x3 and nodes y1, y2 and y3 they are w13, w23 and w33. A code sketch of this acquisition step follows.
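The following sketch is illustrative only and not part of the patent text: it assumes PyTorch, restricts the set network layer to Conv2d and Linear modules, and the function name is invented for this example.

import torch
import torch.nn as nn

def collect_set_layer_weights(model: nn.Module) -> dict:
    # Gather the input characteristic parameters (weights) of every
    # convolutional and fully connected layer in the model.
    weights = {}
    for name, module in model.named_modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            # For nn.Linear the weight has shape (out_features, in_features),
            # i.e. entry [j][i] is the parameter between input node x_i and
            # output node y_j (the w11 ... w33 of fig. 1b).
            weights[name] = module.weight
    return weights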
S102, sorting the input characteristic parameters, and selecting at least one input characteristic parameter with the minimum value according to a sorting result.
In the embodiment of the application, characteristic parameters with small values have little influence on the final output of the model, yet they can be exploited by an adversarial attacker; a model trained without addressing them therefore has poor robustness. To improve the robustness of the deep neural network model, the input characteristic parameters with smaller values are selected during model training, for example by sorting the input characteristic parameters and selecting at least one input characteristic parameter with the smallest value according to the sorting result. In an alternative embodiment, selecting at least one input characteristic parameter with the smallest value according to the sorting result includes:
and selecting N input characteristic parameters with the minimum numerical values according to the sorting result, wherein N is equal to an integer value obtained by multiplying the total number of the input characteristic parameters by a preset proportional value, so that the N input characteristic parameters needing to be adjusted can be quickly and accurately selected. Illustratively, the total number of the input characteristic parameters is 1000, the preset proportion is 10%, and after the values of the input characteristic parameters are sorted from small to large, only the first 100 values need to be selected.
It should be noted that if some input characteristic parameter values are negative, the parameters should be sorted by the absolute values of their values before the N smallest are selected, which improves the accuracy of the selection. A sketch of this selection follows.
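Continuing the illustrative PyTorch sketch above (the 10% default is only an example of the preset proportion; sorting is by absolute value to handle negative parameters):

def select_smallest(weights: dict, ratio: float = 0.10) -> torch.Tensor:
    # Flatten all input characteristic parameters, sort them by absolute
    # value, and return the indices of the N smallest, where
    # N = int(total_count * ratio), e.g. 1000 * 10% -> 100.
    flat = torch.cat([w.detach().flatten() for w in weights.values()])
    n = int(flat.numel() * ratio)
    order = torch.argsort(flat.abs())  # ascending by absolute value
    return order[:n]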
S103, reducing the numerical value of each selected input characteristic parameter.
To improve the robustness of the deep neural network model, the value of each selected input characteristic parameter can be reduced until its influence on the model is negligible; even if such a parameter is exploited by an adversarial attacker, the accuracy of the model is not seriously affected, and robustness is thereby ensured.
In an alternative embodiment, to further ensure robustness, the value of each selected input characteristic parameter may be set directly to 0, so that none of the selected parameters can be exploited by an adversarial attacker. Setting these values to 0 also prunes the selected input characteristic parameters, thereby compressing the deep neural network model. A sketch of this zeroing step follows.
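A minimal sketch of the zeroing step under the same illustrative assumptions (note that ties at the threshold magnitude may zero slightly more than N parameters):

def zero_selected(weights: dict, ratio: float = 0.10) -> None:
    # Set the N smallest-magnitude input characteristic parameters to 0
    # in place, which both blunts adversarial exploitation and prunes
    # (compresses) the model.
    with torch.no_grad():
        flat = torch.cat([w.flatten() for w in weights.values()])
        n = max(int(flat.numel() * ratio), 1)
        threshold = flat.abs().kthvalue(n).values  # N-th smallest magnitude
        for w in weights.values():
            w.masked_fill_(w.abs() <= threshold, 0.0)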
Further, after the value of each selected input feature parameter is reduced, the method further includes S1031 to S1033:
and S1031, judging whether the iterative training of the set times is finished or whether the accuracy of the deep neural network model is not lower than a preset value.
In the embodiment of the present application, one training iteration of the deep neural network model means: inputting one training sample into the deep neural network model for training while adjusting the input characteristic parameters according to S101-S103. The set number is the total number of iterations required to train the model; for example, if the set number is 200, then 200 training samples are used for 200 iterations over the whole training process, and the input characteristic parameters are adjusted according to S101-S103 in each iteration. The preset value refers to the model accuracy obtained when a deep neural network model is trained in the prior-art manner, i.e. without adjusting the input characteristic parameters during training.
After each training iteration, it is judged whether the set number of iterations has been completed or whether the accuracy of the deep neural network model is not lower than the preset value; if so, S1032 is executed, otherwise S1033.
S1032, determining that the training of the deep neural network model is finished.
When the set number of iterations has been completed or the accuracy of the deep neural network model is not lower than the preset value, the resulting deep neural network model has high robustness while its accuracy is preserved, and training can end.
S1033, taking another sample data as the current sample data, and then returning to the operation of acquiring each input characteristic parameter of the set network layer contained in the deep neural network model when the deep neural network model is trained with the current sample data.
Here, the other sample data may be any untrained sample in the training sample set. When the set number of iterations has not been completed and the accuracy of the deep neural network model is below the preset value, training continues with another sample; it should be noted that the model parameters are adjusted according to S101-S103 in every iteration until the condition in S1031 is met. A sketch of the overall loop follows.
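The overall loop could then be sketched as follows; this is an assumption-laden illustration, with `evaluate` a hypothetical helper returning the model's accuracy on a validation set, and one training sample consumed per iteration as in S1031-S1033:

def train_with_pruning(model, loss_fn, optimizer, samples,
                       set_times=200, preset_acc=None, ratio=0.10):
    for step, (x, y) in enumerate(samples):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        # S101-S103: acquire, sort and reduce the smallest parameters.
        zero_selected(collect_set_layer_weights(model), ratio)
        # S1031: stop after the set number of iterations, or once accuracy
        # reaches the preset value.
        if step + 1 >= set_times or (
                preset_acc is not None and evaluate(model) >= preset_acc):
            break  # S1032: training of the model is finished
        # S1033: otherwise continue with the next (untrained) sample.
    return model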
It should be noted that, since clipping the model parameters (that is, setting their values to zero) is equivalent to compressing the model, training the compressed model consumes less computation and less training time.
In the embodiment of the application, the selected characteristic parameters with small values have little influence on the final output of the model yet can be exploited by an adversarial attacker; by reducing the values of the selected input characteristic parameters, the precision of the model is not seriously affected even if those parameters are exploited, and robustness is thereby ensured. Training ends only when the set number of iterations has been completed or the accuracy of the deep neural network model is not lower than the preset value, so the robustness of the model improves while its precision is preserved, and the training time and computing resources consumed are reduced compared with normal training.
Fig. 2 is a schematic flow chart of a model parameter adjustment method according to a second embodiment of the present application. This embodiment further optimizes the method of the above embodiments by adding, after training of the deep neural network model is completed, an operation of verifying the model training effect. As shown in fig. 2, the method specifically includes the following steps:
s201, obtaining a verification sample set.
Wherein the verification sample set comprises a plurality of verification samples for verifying the accuracy of the deep neural network model.
S202, generating a first adversarial sample set based on the verification sample set and the deep neural network model obtained after training is finished.
Optionally, subtle interference information is added to each verification sample in the verification sample set, and the perturbed verification sample is input into the deep neural network model obtained after training; if the model produces an incorrect output with high confidence, the perturbed verification sample is determined to be an adversarial sample. All adversarial samples acquired in this way form the first adversarial sample set. One possible realization of this perturbation is sketched below.
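The patent does not prescribe how the interference information is generated; the sketch below uses an FGSM-style gradient-sign step as one concrete possibility, with eps and the 0.9 confidence threshold chosen arbitrarily for illustration:

def make_adversarial_set(model, loss_fn, val_loader, eps=0.01, conf_thresh=0.9):
    # Perturb each verification sample and keep those the model now
    # misclassifies with high confidence; these form the adversarial set.
    adversarial = []
    model.eval()
    for x, y in val_loader:
        x = x.clone().requires_grad_(True)
        loss_fn(model(x), y).backward()
        x_adv = (x + eps * x.grad.sign()).detach()  # subtle perturbation
        probs = torch.softmax(model(x_adv), dim=1)
        conf, pred = probs.max(dim=1)
        for i in range(x_adv.size(0)):
            if pred[i] != y[i] and conf[i] >= conf_thresh:
                adversarial.append((x_adv[i], int(y[i])))
    return adversarial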
S203, generating a second adversarial sample set based on the verification sample set and the existing deep neural network model.
The existing deep neural network model is obtained by an existing training mode, i.e. a training mode in which the input characteristic parameters of the network layers are not adjusted during training.
Subtle interference information is added to each verification sample in the verification sample set, and the perturbed verification sample is input into the existing deep neural network model; if the existing deep neural network model produces an incorrect output with high confidence, the perturbed verification sample is determined to be an adversarial sample, and all adversarial samples acquired in this way form the second adversarial sample set.
S204, determining a first accuracy of the existing deep neural network model according to the first adversarial sample set, determining a second accuracy of the deep neural network model obtained after training is finished according to the second adversarial sample set, and outputting the first accuracy and the second accuracy so that the model training effect can be determined from the output.
Each adversarial sample in the first adversarial sample set is input into the existing deep neural network model, and the first accuracy is determined from its output results; similarly, each adversarial sample in the second adversarial sample set is input into the deep neural network model obtained after training, and the second accuracy is determined from its output results. The training effect is then judged from the two accuracy values: for example, if the second accuracy is greater than the first accuracy, the deep neural network model obtained after training is determined to be more robust than the existing deep neural network model.
In an alternative embodiment, determining the first accuracy of the existing deep neural network model according to the first adversarial sample set and determining the second accuracy of the deep neural network model obtained after training is finished according to the second adversarial sample set includes:
inputting each first adversarial sample in the first adversarial sample set into the existing deep neural network model to obtain the prediction result output for each first adversarial sample; and comparing each prediction result with the label information of the corresponding first adversarial sample to determine the prediction accuracy of the existing deep neural network model, which is taken as the first accuracy. If a prediction result matches the label information of its first adversarial sample, that prediction is accurate; otherwise it is not. The prediction accuracy of the existing deep neural network model is thus the ratio of accurate predictions to the total number of predictions.
Similarly, each second adversarial sample in the second adversarial sample set is input into the deep neural network model obtained after training to obtain the prediction result output for each second adversarial sample; each prediction result is compared with the label information of the corresponding second adversarial sample to determine the prediction accuracy of the deep neural network model obtained after training, which is taken as the second accuracy. Again, a prediction is accurate if it matches the corresponding label information, and the prediction accuracy is the ratio of accurate predictions to the total number of predictions. A sketch of this accuracy computation follows.
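A sketch of the accuracy computation under the same illustrative assumptions; the same helper serves for both the first accuracy (existing model, first set) and the second accuracy (newly trained model, second set):

def adversarial_accuracy(model, adversarial_set):
    # Ratio of accurate predictions: a prediction is accurate when it
    # matches the label information of the corresponding adversarial sample.
    correct = 0
    with torch.no_grad():
        for x, label in adversarial_set:
            pred = model(x.unsqueeze(0)).argmax(dim=1).item()
            correct += int(pred == label)
    return correct / max(len(adversarial_set), 1)

# e.g. first_acc  = adversarial_accuracy(existing_model, first_set)
#      second_acc = adversarial_accuracy(trained_model, second_set)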
In the embodiment of the application, the first and second adversarial sample sets are generated from the verification sample set together with the deep neural network model obtained after training and the existing deep neural network model respectively, so adversarial samples are identified without resorting to conventional adversarial training. The two adversarial sample sets are then input into the existing deep neural network model and the deep neural network model obtained after training respectively, and the model training effect is determined by comparing the accuracies of the two models' outputs, thereby verifying the precision of the deep neural network model obtained after training.
Fig. 3 is a schematic structural diagram of a model parameter adjustment apparatus according to a third embodiment of the present application, applicable to improving the robustness of a deep neural network model. As shown in fig. 3, the apparatus 300 specifically includes:
a characteristic parameter obtaining module 301, configured to obtain each input characteristic parameter of a set network layer included in a deep neural network model when the deep neural network model is trained by using current sample data;
a sorting and screening module 302, configured to sort the input feature parameters and select at least one input feature parameter with the smallest value according to the sorting result;
and the adjusting module 303 is configured to reduce the value of each selected input characteristic parameter.
Optionally, the adjusting module is specifically configured to:
and adjusting the value of each selected input characteristic parameter to be 0.
Optionally, the sorting and screening module includes:
and the screening unit is used for selecting N input characteristic parameters with the minimum numerical values according to the sorting result, wherein N is equal to an integer value obtained by multiplying the total number of the input characteristic parameters by a preset proportional value.
Optionally, the apparatus further comprises:
the judging module is used for judging whether iterative training of set times is finished or whether the accuracy of the deep neural network model is not lower than a preset value;
the training ending module is used for determining that the training of the deep neural network model is ended if the judgment result is yes;
and the continuing training module is used for, if the judgment result is negative, taking another sample data as the current sample data and then returning to the operation of acquiring each input characteristic parameter of the set network layer contained in the deep neural network model when the deep neural network model is trained with the current sample data.
Optionally, the deep neural network model is a convolutional neural network model, and the set network layer includes a convolutional layer and/or a fully connected layer.
Optionally, the apparatus further comprises:
the verification sample acquisition module is used for acquiring a verification sample set;
the first adversarial sample generation module is used for generating a first adversarial sample set based on the verification sample set and the deep neural network model obtained after training is finished;
the second adversarial sample generation module is used for generating a second adversarial sample set based on the verification sample set and the existing deep neural network model, the existing deep neural network model being obtained by an existing training mode, i.e. a training mode in which the input characteristic parameters of the network layers are not adjusted during training;
and the effect verification module is used for determining a first accuracy of the existing deep neural network model according to the first adversarial sample set, determining a second accuracy of the deep neural network model obtained after training is finished according to the second adversarial sample set, and outputting the first accuracy and the second accuracy so as to determine the model training effect from the output.
Optionally, the effect verification module includes:
the first accuracy determining unit is used for inputting each first adversarial sample in the first adversarial sample set into the existing deep neural network model to obtain the prediction result output for each first adversarial sample, comparing each prediction result with the label information of the corresponding first adversarial sample to determine the prediction accuracy of the existing deep neural network model, and taking this prediction accuracy as the first accuracy;
and the second accuracy determining unit is used for inputting each second adversarial sample in the second adversarial sample set into the deep neural network model obtained after training is finished to obtain the prediction result output for each second adversarial sample, determining the prediction accuracy of the deep neural network model obtained after training by comparing each prediction result with the label information of the corresponding second adversarial sample, and taking this prediction accuracy as the second accuracy.
The model parameter adjustment apparatus 300 of the embodiment of the present application can execute the model parameter adjustment method provided by any embodiment of the present application, and has the corresponding functional modules and beneficial effects. For details not explicitly described in this embodiment, reference may be made to the description of any method embodiment of the present application.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 4 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not meant to limit the implementations of the application described and/or claimed herein.
As shown in fig. 4, the electronic apparatus includes: one or more processors 401, a memory 402, and interfaces for connecting the components, including high-speed and low-speed interfaces. The components are interconnected by different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used together with multiple memories, as desired. Likewise, multiple electronic devices may be connected, each providing part of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 4, one processor 401 is taken as an example.
Memory 402 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the model parameter adjustment method provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the model parameter adjustment method provided herein.
The memory 402, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as the program instructions/modules corresponding to the model parameter adjustment method in the embodiments of the present application (e.g., the characteristic parameter acquisition module 301, the sorting and screening module 302, and the adjustment module 303 shown in fig. 3). The processor 401 executes the various functional applications and data processing of the server by running the non-transitory software programs, instructions and modules stored in the memory 402, that is, implements the model parameter adjustment method of the above method embodiments.
The memory 402 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of an electronic device that implements the model parameter adjustment method of the embodiment of the present application, and the like. Further, the memory 402 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 402 may optionally include a memory remotely located from the processor 401, and these remote memories may be connected via a network to an electronic device implementing the model parameter adjustment method of the embodiments of the present application. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for implementing the model parameter adjustment method according to the embodiment of the present application may further include: an input device 403 and an output device 404. The processor 401, the memory 402, the input device 403 and the output device 404 may be connected by a bus or other means, and fig. 4 illustrates an example of a connection by a bus.
The input device 403 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, for example a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, or a joystick. The output device 404 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical scheme of the embodiments of the application, the selected characteristic parameters with small values have little influence on the final output of the model yet can be exploited by an adversarial attacker; by reducing the values of the selected input characteristic parameters, the precision of the model is not seriously affected even if those parameters are exploited, and robustness is thereby ensured. Training ends only when the set number of iterations has been completed or the accuracy of the deep neural network model is not lower than the preset value, so the robustness of the model improves while its precision is preserved, and the training time and computing resources consumed are reduced compared with normal training.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (16)

1. A method of model parameter adjustment, comprising:
acquiring each input characteristic parameter of a set network layer contained in a deep neural network model when the deep neural network model is trained by using current sample data;
sorting the input characteristic parameters, and selecting at least one input characteristic parameter with the minimum value according to a sorting result;
and reducing the numerical value of each selected input characteristic parameter.
2. The method of claim 1, wherein the step of reducing the value of each selected input feature parameter comprises:
and adjusting the value of each selected input characteristic parameter to be 0.
3. The method of claim 1, wherein selecting the at least one input feature parameter with the smallest value according to the sorting result comprises:
and selecting N input characteristic parameters with the minimum numerical values according to the sorting result, wherein N is equal to an integer value obtained by multiplying the total number of the input characteristic parameters by a preset proportional value.
4. The method of claim 1, wherein after reducing the value of each selected input feature parameter, the method further comprises:
judging whether iterative training of set times is finished or whether the accuracy of the deep neural network model is not lower than a preset value;
if so, determining that the training of the deep neural network model is finished; if not, after taking another sample data as the current sample data, returning to the operation of acquiring each input characteristic parameter of the set network layer contained in the deep neural network model when the deep neural network model is trained with the current sample data.
5. The method of claim 1, wherein the deep neural network model is a convolutional neural network model, and the set network layers comprise convolutional layers and/or fully-connected layers.
6. The method of any of claims 1-5, wherein after training of the deep neural network model is complete, the method further comprises:
obtaining a verification sample set;
generating a first adversarial sample set based on the verification sample set and the deep neural network model obtained after training is finished; generating a second adversarial sample set based on the verification sample set and an existing deep neural network model, the existing deep neural network model being obtained by an existing training mode, wherein the existing training mode is a training mode in which input characteristic parameters of the network layers are not adjusted in the training process;
and determining a first accuracy of the existing deep neural network model according to the first adversarial sample set, determining a second accuracy of the deep neural network model obtained after training is finished according to the second adversarial sample set, and outputting the first accuracy and the second accuracy so as to determine a model training effect from the output.
7. The method of claim 6, wherein determining a first accuracy of the existing deep neural network model according to the first adversarial sample set and determining a second accuracy of the deep neural network model obtained after training is finished according to the second adversarial sample set comprises:
respectively inputting each first adversarial sample in the first adversarial sample set into the existing deep neural network model to obtain a prediction result output by the existing deep neural network model for each first adversarial sample; comparing each prediction result with the label information corresponding to each first adversarial sample to determine a prediction accuracy of the existing deep neural network model, and taking the prediction accuracy as the first accuracy;
and respectively inputting each second adversarial sample in the second adversarial sample set into the deep neural network model obtained after training is finished, obtaining a prediction result output by that model for each second adversarial sample, determining the prediction accuracy of the deep neural network model obtained after training by comparing each prediction result with the label information corresponding to each second adversarial sample, and taking the prediction accuracy as the second accuracy.
8. A model parameter adjustment apparatus comprising:
the characteristic parameter acquisition module is used for acquiring each input characteristic parameter of a set network layer contained in the deep neural network model when the deep neural network model is trained by using current sample data;
the sorting and screening module is used for sorting the input characteristic parameters and selecting at least one input characteristic parameter with the minimum numerical value according to a sorting result;
and the adjusting module is used for reducing the numerical value of each selected input characteristic parameter.
9. The apparatus of claim 8, wherein the adjustment module is specifically configured to:
and adjusting the value of each selected input characteristic parameter to be 0.
10. The apparatus of claim 8, wherein the rank filter module comprises:
and the screening unit is used for selecting N input characteristic parameters with the minimum numerical values according to the sorting result, wherein N is equal to an integer value obtained by multiplying the total number of the input characteristic parameters by a preset proportional value.
11. The apparatus of claim 8, wherein the apparatus further comprises:
the judging module is used for judging whether iterative training of set times is finished or whether the accuracy of the deep neural network model is not lower than a preset value;
the training ending module is used for determining that the training of the deep neural network model is ended if the judgment result is yes;
and the continuing training module is used for, if the judgment result is negative, taking another sample data as the current sample data and then returning to the operation of acquiring each input characteristic parameter of the set network layer contained in the deep neural network model when the deep neural network model is trained with the current sample data.
12. The apparatus of claim 8, wherein the deep neural network model is a convolutional neural network model, and the set network layers comprise convolutional layers and/or fully-connected layers.
13. The apparatus of any of claims 8-12, wherein the apparatus further comprises:
the verification sample acquisition module is used for acquiring a verification sample set;
the first adversarial sample generation module is used for generating a first adversarial sample set based on the verification sample set and the deep neural network model obtained after training is finished;
the second adversarial sample generation module is used for generating a second adversarial sample set based on the verification sample set and the existing deep neural network model, the existing deep neural network model being obtained by an existing training mode, wherein the existing training mode is a training mode in which input characteristic parameters of the network layers are not adjusted in the training process;
and the effect verification module is used for determining a first accuracy of the existing deep neural network model according to the first adversarial sample set, determining a second accuracy of the deep neural network model obtained after training is finished according to the second adversarial sample set, and outputting the first accuracy and the second accuracy so as to determine a model training effect from the output.
14. The apparatus of claim 13, wherein the effect verification module comprises:
the first accuracy determining unit is used for respectively inputting each first adversarial sample in the first adversarial sample set into the existing deep neural network model and obtaining a prediction result output by the existing deep neural network model for each first adversarial sample; comparing each prediction result with the label information corresponding to each first adversarial sample to determine a prediction accuracy of the existing deep neural network model, and taking the prediction accuracy as the first accuracy;
and the second accuracy determining unit is used for respectively inputting each second adversarial sample in the second adversarial sample set into the deep neural network model obtained after training is finished, obtaining a prediction result output by that model for each second adversarial sample, determining the prediction accuracy of the deep neural network model obtained after training by comparing each prediction result with the label information corresponding to each second adversarial sample, and taking the prediction accuracy as the second accuracy.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the model parameter adjustment method of any one of claims 1-7.
16. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the model parameter adjustment method according to any one of claims 1 to 7.
CN202010544321.3A 2020-06-15 2020-06-15 Model parameter adjusting method and device, electronic equipment and storage medium Pending CN111753955A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010544321.3A CN111753955A (en) 2020-06-15 2020-06-15 Model parameter adjusting method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010544321.3A CN111753955A (en) 2020-06-15 2020-06-15 Model parameter adjusting method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111753955A true CN111753955A (en) 2020-10-09

Family

ID=72674929

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010544321.3A Pending CN111753955A (en) 2020-06-15 2020-06-15 Model parameter adjusting method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111753955A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115881212A (en) * 2022-10-26 2023-03-31 溪砾科技(深圳)有限公司 RNA target-based small molecule compound screening method and device



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20201009)