CN117056957A - Verifiable data forgetting privacy protection method and device for minimax learning model - Google Patents
- Publication number
- CN117056957A (application number CN202310498860.1A)
- Authority
- CN
- China
- Prior art keywords
- forgetting
- minimum
- data
- learning model
- maximum
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The application discloses a verifiable data forgetting privacy protection method and device for a minimax learning model. Based on the total Hessian curvature matrix, the method applies a Newton-step update to the parameters of the minimax model and adds a random perturbation, thereby removing the influence of the forgotten data from the model, achieving effective and verifiable machine learning model forgetting, and approximating the effect of retraining on the remaining data. The application provides, for the first time, a verifiable machine learning model forgetting method for minimax problems. It makes full use of the parameters and data of the trained model to obtain the new parameters produced by the model forgetting mechanism, avoids the high computational overhead of retraining, and protects data privacy while processing user data deletion requests.
Description
Technical Field
The application relates to the field of privacy protection in machine learning, and in particular to a verifiable data forgetting privacy protection method and device for a minimax learning model.
Background
Machine learning is an important branch of artificial intelligence. By learning from and analyzing data, it enables a computer system to discover patterns and rules automatically, so that the system can autonomously perform tasks such as prediction, classification, and recognition. Machine learning algorithms build predictive models by analyzing large amounts of user data and finding the rules and patterns in it. These models are applied in many fields, such as natural language processing, image recognition, recommendation systems, and predictive analytics. The minimax learning model is a model commonly used in game theory and machine learning; it seeks an equilibrium among multiple decision makers (also called players) so that each decision maker obtains the best result in the worst case. Minimax learning models are widely used in machine learning, including generative adversarial networks, robust learning, adversarial training, algorithmic fairness, Markov decision processes, and so on.
In practice, machine learning models may be trained on large amounts of sensitive data, such as medical records, financial data, and personal identity information, so protecting user privacy is critical. In recent years, a series of data privacy laws and regulations have been enacted in China and abroad. They provide important legal protection for personal information, especially sensitive personal information, including the right to deletion (also called the right to be forgotten). These regulations require personal data to be deleted at the user's request, and may even require deletion of models and algorithms derived from that data. Deleting the target data from the database that holds the training data set is relatively easy, but stopping at this step does not ensure that a machine learning model already trained and deployed on that data set complies with the right to deletion. If the trained model is not updated so that it forgets the training data targeted for deletion, the model still risks leaking private information about that data. The trained model must therefore be further updated by machine learning model data forgetting, to ensure that it does not reveal personal private information in subsequent use.
The simplest way to make a machine learning model forget data is to retrain it on the new data set obtained after removing the target data, but this incurs high computational and time costs. In recent years, research on model forgetting has proposed a range of forgetting mechanisms based on different theoretical ideas and technical routes, in order to avoid retraining. According to the nature of the forgetting mechanism, model forgetting can be roughly divided into exact model forgetting and approximate model forgetting. In exact model forgetting, the model updated by the mechanism is identical to the model obtained by retraining, which means the mechanism completely removes the information related to the target data. In approximate model forgetting, the model updated by the mechanism is approximately the same as the model obtained by retraining, which means the mechanism approximately removes the information related to the target data. Verifiable model forgetting guarantees that, after data is deleted, the machine learning model behaves as if the deleted data had never been observed.
However, existing machine learning model forgetting methods are limited to standard learning models: they consider optimization over a single parameter variable only and do not address data forgetting for minimax learning models, which involve two parameter variables.
Disclosure of Invention
The application aims to overcome the shortcomings of the prior art by providing a verifiable data forgetting privacy protection method and device for a minimax learning model. The application achieves approximate model forgetting for models with two parameter variables.
The aim of the application is achieved by the following technical solution. A first aspect of the embodiments of the application provides a verifiable data forgetting privacy protection method for a minimax learning model, comprising the following steps:
(1) For an original data set, computing the average of the loss function over all samples to obtain the empirical risk, and training the minimax learning model to obtain the optimal solution that minimizes and maximizes the empirical risk, which gives the minimization parameter and the maximization parameter of the minimax learning model;
(2) Computing the total Hessian matrices at the optimal solution obtained in step (1), each total Hessian matrix comprising a direct Hessian part and an indirect Hessian part;
(3) According to the optimal solution obtained in step (1), the total Hessian matrices obtained in step (2), and the user's data deletion request, performing a Newton-step forgetting update on the minimization parameter and the maximization parameter of the minimax learning model to obtain the updated minimization parameter and maximization parameter;
(4) Adding Gaussian noise as a random perturbation to the updated minimization parameter and maximization parameter obtained in step (3) to obtain the final forgetting model, and completing verifiable data forgetting privacy protection according to the forgetting model.
Further, the optimal solution that minimizes and maximizes the empirical risk in step (1) is obtained as follows:
$$F_S(w, v) = \frac{1}{n}\sum_{i=1}^{n} f(w, v; z_i), \qquad (w_S^*, v_S^*) = \arg\min_{w}\max_{v} F_S(w, v)$$
where $n$ is the size of the original data set $S$, $z_i$ is the $i$-th data sample in the data set, $f(\cdot)$ is the loss function, $F_S(\cdot)$ is the empirical risk on the original data set, $w$ and $v$ are respectively the minimization parameter and the maximization parameter of the minimax learning model to be learned, $w_S^*$ is the minimization parameter that minimizes the empirical risk, and $v_S^*$ is the maximization parameter that maximizes the empirical risk.
Further, the total Hessian matrices at the optimal solution obtained in step (1) are computed in step (2) as follows, with all derivatives evaluated at $(w_S^*, v_S^*)$:
$$D_{ww}F_S = \partial_{ww}F_S - \partial_{wv}F_S\,\partial_{vv}^{-1}F_S\,\partial_{vw}F_S$$
$$D_{vv}F_S = \partial_{vv}F_S - \partial_{vw}F_S\,\partial_{ww}^{-1}F_S\,\partial_{wv}F_S$$
where $D_{ww}F_S$ and $D_{vv}F_S$ are the total Hessian matrices of the minimization parameter and the maximization parameter at the optimal solution, respectively; $\partial_{ww}F_S$, $\partial_{wv}F_S$, $\partial_{vw}F_S$ and $\partial_{vv}F_S$ are the second partial derivatives of the empirical risk function $F_S$ with respect to $w$ and $v$: $\partial_{ww}F_S$ differentiates $F_S$ twice with respect to $w$, $\partial_{wv}F_S$ differentiates first with respect to $w$ and then with respect to $v$, $\partial_{vw}F_S$ differentiates first with respect to $v$ and then with respect to $w$, and $\partial_{vv}F_S$ differentiates twice with respect to $v$.
Further, the step (3) includes the following substeps:
(3.1) Constructing a deletion request data set $U$ from the user's data deletion request and, using the optimal solution $(w_S^*, v_S^*)$ on the original data set obtained in step (1) and the total Hessian matrices $D_{ww}F_S(w_S^*, v_S^*)$ and $D_{vv}F_S(w_S^*, v_S^*)$ of the empirical risk obtained in step (2), computing the total Hessian matrices $TH_w$ and $TH_v$ at the optimal solution on the remaining data set:
$$TH_w = \frac{1}{n-m}\Big(n\,D_{ww}F_S(w_S^*, v_S^*) - \sum_{z_i \in U} D_{ww}f(w_S^*, v_S^*; z_i)\Big)$$
$$TH_v = \frac{1}{n-m}\Big(n\,D_{vv}F_S(w_S^*, v_S^*) - \sum_{z_i \in U} D_{vv}f(w_S^*, v_S^*; z_i)\Big)$$
where $n$ is the size of the original data set, $m$ is the size of the deletion request data set $U$, and $z_i$ is the $i$-th data sample in the data set;
(3.2) Using the optimal solution $(w_S^*, v_S^*)$ on the original data set obtained in step (1) and the total Hessian matrices $TH_w$ and $TH_v$ on the remaining data set obtained in step (3.1), performing a Newton-step forgetting update on the minimization parameter and the maximization parameter to obtain the updated minimization parameter $\hat{w}$ and maximization parameter $\hat{v}$:
$$\hat{w} = w_S^* + \frac{1}{n-m}\,TH_w^{-1}\sum_{z_i \in U}\partial_w f(w_S^*, v_S^*; z_i)$$
$$\hat{v} = v_S^* + \frac{1}{n-m}\,TH_v^{-1}\sum_{z_i \in U}\partial_v f(w_S^*, v_S^*; z_i)$$
where $\partial_w$ and $\partial_v$ denote the first derivatives with respect to $w$ and $v$, respectively, $n$ is the size of the original data set, $U$ is the deletion request data set, and $m$ is the size of $U$.
Further, Gaussian noise is added in step (4) to the updated minimization parameter and maximization parameter obtained in step (3) as follows:
$$w^u = \hat{w} + \xi_1, \qquad v^u = \hat{v} + \xi_2$$
where $w^u$ and $v^u$ are the parameters of the final forgetting model; $\xi_1 \sim \mathcal{N}(0, \sigma_1^2 I)$ and $\xi_2 \sim \mathcal{N}(0, \sigma_2^2 I)$ are the Gaussian noise added to the updated minimization parameter $\hat{w}$ and the updated maximization parameter $\hat{v}$, respectively; $I$ is the identity matrix; and $\sigma_1$ and $\sigma_2$ are the standard deviations of the respective Gaussian noise distributions.
A second aspect of the embodiments of the application provides a verifiable data forgetting privacy protection device for a minimax learning model, comprising one or more processors configured to implement the above verifiable data forgetting privacy protection method for a minimax learning model.
A third aspect of the embodiments of the application provides a computer-readable storage medium storing a program which, when executed by a processor, implements the above verifiable data forgetting privacy protection method for a minimax learning model.
The beneficial effects of the application are as follows. By computing the total Hessian matrices, a Newton-step update is applied to the existing model parameters, and a carefully designed random perturbation is added to achieve a verifiable deletion guarantee; the memory cost is low, and the effect of retraining on the remaining data is approximately achieved. The application provides, for the first time, a verifiable data forgetting privacy protection method for a minimax learning model; it avoids the high computational overhead of retraining and protects the data privacy of users.
Drawings
FIG. 1 is a schematic diagram of the overall flow of the verifiable data forgetting privacy protection method for a minimax learning model of the application;
FIG. 2 is a schematic structural diagram of the verifiable data forgetting privacy protection device for a minimax learning model of the application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the application. Depending on the context, the term "if" as used herein may be interpreted as "when", "upon", or "in response to a determination".
Verifiable data forgetting for the minimax learning model in this application is a form of approximate model forgetting. Its core technique is to apply a Newton-step forgetting update to the existing model parameters based on the total Hessian matrices, and to add a random perturbation to the updated model parameters, thereby achieving verifiable data forgetting privacy protection.
The minimax learning model is a machine learning model for solving two-player zero-sum games. In this model, two opponents pursue their own goals by acting in alternation, that is, by minimizing their own loss or maximizing their own gain. One opponent is called the minimizer and the other is called the maximizer.
Referring to FIG. 1, the verifiable data forgetting privacy protection method for a minimax learning model of the application applies a Newton-step update to the parameters of the existing minimax learning model based on the total Hessian curvature matrix and adds a carefully designed random perturbation to achieve a verifiable deletion guarantee. It comprises the following steps:
(1) For the original data set, compute the average of the loss function over all samples to obtain the empirical risk, and train the minimax learning model to obtain the optimal solution that minimizes and maximizes the empirical risk, which gives the minimization parameter and the maximization parameter of the minimax learning model.
In this embodiment, the empirical risk on the original data set is obtained by summing the loss function over all samples and then averaging. The minimax learning model is trained and optimized by a learning algorithm to obtain the optimal solution $(w_S^*, v_S^*)$ that minimizes and maximizes the empirical risk, which gives the minimization parameter and the maximization parameter of the model. It should be understood that the minimax learning model may be trained and optimized by stochastic gradient descent, grid search, or any other method, as long as an optimal solution that minimizes and maximizes the empirical risk can be computed. The solution that minimizes the empirical risk is taken as the minimization parameter $w_S^*$, and the solution that maximizes the empirical risk is taken as the maximization parameter $v_S^*$. Specifically:
$$F_S(w, v) = \frac{1}{n}\sum_{i=1}^{n} f(w, v; z_i), \qquad (w_S^*, v_S^*) = \arg\min_{w}\max_{v} F_S(w, v)$$
where $n$ is the size of the original data set $S$, $z_i$ is the $i$-th data sample in the data set, $f(\cdot)$ is the loss function, $F_S(\cdot)$ is the empirical risk on the original data set, $w$ and $v$ are respectively the minimization parameter and the maximization parameter of the minimax learning model to be learned, $w_S^*$ is the minimization parameter that minimizes the empirical risk, and $v_S^*$ is the maximization parameter that maximizes the empirical risk.
It should be understood that $f(\cdot)$ is the loss function, which can be chosen according to the specific problem. The minimax learning model is trained on the samples of the original data set; the loss of each sample is computed from the model output and the corresponding target in the original data set, and the average of all losses gives the empirical risk.
In this embodiment, the two parameters of the minimax learning model influence each other in a nested way. The minimization parameter $w$ is a function of the maximization parameter $v$, written $W_S(v) := \arg\min_w F_S(w, v)$, where $:=$ denotes a definition. Likewise, the maximization parameter $v$ is a function of the minimization parameter $w$, written $V_S(w) := \arg\max_v F_S(w, v)$.
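As an illustration of step (1) and of the nested functions $W_S(v)$ and $V_S(w)$, the following is a minimal sketch on a toy problem. The per-sample loss, the data values, and all function names are assumptions made for this example and are not taken from the application; simultaneous gradient descent-ascent stands in for whichever training method is actually used.

```python
import numpy as np

def loss(w, v, z):
    """Hypothetical per-sample loss: strongly convex in w, strongly concave in v."""
    x, y = z
    return 0.5 * (w - x) ** 2 + y * w * v - 0.5 * v ** 2

def empirical_risk(w, v, data):
    """F_S(w, v): average of the per-sample losses over the data set."""
    return float(np.mean([loss(w, v, z) for z in data]))

def best_response_w(v, data):
    """W_S(v) = argmin_w F_S(w, v), in closed form for this toy loss."""
    x_bar, y_bar = data.mean(axis=0)
    return x_bar - y_bar * v

def best_response_v(w, data):
    """V_S(w) = argmax_v F_S(w, v), in closed form for this toy loss."""
    _, y_bar = data.mean(axis=0)
    return y_bar * w

def train_minimax(data, lr=0.1, steps=2000):
    """Simultaneous gradient descent on w and gradient ascent on v."""
    x_bar, y_bar = data.mean(axis=0)
    w, v = 0.0, 0.0
    for _ in range(steps):
        grad_w = (w - x_bar) + y_bar * v  # partial derivative of F_S w.r.t. w
        grad_v = y_bar * w - v            # partial derivative of F_S w.r.t. v
        w, v = w - lr * grad_w, v + lr * grad_v
    return w, v

# Each row is one sample z_i = (x_i, y_i); values chosen for illustration only.
data = np.array([[1.0, 0.5], [2.0, 1.0], [3.0, 1.5]])
w_star, v_star = train_minimax(data)
```

At the returned point each parameter is the best response to the other, which is the nested relationship described above: `best_response_w(v_star, data)` agrees with `w_star`, and `best_response_v(w_star, data)` agrees with `v_star`.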
(2) Compute the total Hessian matrices at the optimal solution obtained in step (1), where each total Hessian matrix comprises a direct Hessian part and an indirect Hessian part.
In this embodiment, each total Hessian matrix consists of two parts: a direct Hessian part and an indirect Hessian part. On the original data set, for the minimization parameter $w$ and the maximization parameter $v$ of the minimax learning model, the direct Hessian parts are $\partial_{ww}F_S$ and $\partial_{vv}F_S$, and the indirect Hessian parts are $-\partial_{wv}F_S\,\partial_{vv}^{-1}F_S\,\partial_{vw}F_S$ and $-\partial_{vw}F_S\,\partial_{ww}^{-1}F_S\,\partial_{wv}F_S$. The total Hessian matrices $D_{ww}F_S$ and $D_{vv}F_S$ at the optimal solution $(w_S^*, v_S^*)$ are therefore:
$$D_{ww}F_S = \partial_{ww}F_S - \partial_{wv}F_S\,\partial_{vv}^{-1}F_S\,\partial_{vw}F_S$$
$$D_{vv}F_S = \partial_{vv}F_S - \partial_{vw}F_S\,\partial_{ww}^{-1}F_S\,\partial_{wv}F_S$$
where $D_{ww}F_S$ and $D_{vv}F_S$ are the total Hessian matrices of the minimization parameter and the maximization parameter at the optimal solution, respectively, and all derivatives are evaluated at $(w_S^*, v_S^*)$; $\partial_{ww}F_S$, $\partial_{wv}F_S$, $\partial_{vw}F_S$ and $\partial_{vv}F_S$ are the second partial derivatives of the function $F_S$ with respect to $w$ and $v$: $\partial_{ww}F_S$ differentiates $F_S$ twice with respect to $w$, $\partial_{wv}F_S$ differentiates first with respect to $w$ and then with respect to $v$, $\partial_{vw}F_S$ differentiates first with respect to $v$ and then with respect to $w$, and $\partial_{vv}F_S$ differentiates twice with respect to $v$. When $\partial_{ww}F_S$ and $\partial_{vv}F_S$ are invertible, $\partial_{ww}^{-1}F_S$ and $\partial_{vv}^{-1}F_S$ are shorthand for $(\partial_{ww}F_S)^{-1}$ and $(\partial_{vv}F_S)^{-1}$, respectively.
It should be understood that in this embodiment, only the operation results of step (1) and step (2) need be stored, and the complete original data set need not be stored, and the storage overhead is independent of the size of the original data set.
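The direct and indirect parts described in step (2) can be combined as in the following minimal sketch. It assumes the four second-derivative blocks are already available as NumPy arrays; the function name, argument names, and numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def total_hessians(H_ww, H_wv, H_vw, H_vv):
    """Return (D_ww, D_vv) from the four second-derivative blocks.

    Direct parts: H_ww and H_vv. Indirect parts: the subtracted products.
    H_ww and H_vv are assumed to be invertible.
    """
    # D_ww = H_ww - H_wv H_vv^{-1} H_vw; solve() avoids forming the inverse.
    D_ww = H_ww - H_wv @ np.linalg.solve(H_vv, H_vw)
    # D_vv = H_vv - H_vw H_ww^{-1} H_wv.
    D_vv = H_vv - H_vw @ np.linalg.solve(H_ww, H_wv)
    return D_ww, D_vv

# Illustrative blocks for a 2-dimensional w and a 1-dimensional v.
H_ww = np.array([[2.0, 0.0], [0.0, 3.0]])   # convex in w
H_wv = np.array([[1.0], [0.0]])
H_vw = H_wv.T
H_vv = np.array([[-1.0]])                   # concave in v
D_ww, D_vv = total_hessians(H_ww, H_wv, H_vw, H_vv)
```

In this example the indirect part increases the curvature in the first coordinate of $w$ and makes the curvature in $v$ more negative, which is how the coupling between the two parameters enters the total Hessian.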
(3) According to the optimal solution obtained in step (1), the total Hessian matrices obtained in step (2), and the user's data deletion request, perform a Newton-step forgetting update on the minimization parameter and the maximization parameter of the minimax learning model to obtain the updated minimization parameter and maximization parameter.
(3.1) Construct a deletion request data set $U$ from the user's data deletion request and, using the optimal solution $(w_S^*, v_S^*)$ on the original data set obtained in step (1) and the total Hessian matrices $D_{ww}F_S(w_S^*, v_S^*)$ and $D_{vv}F_S(w_S^*, v_S^*)$ of the empirical risk obtained in step (2), compute the total Hessian matrices $TH_w$ and $TH_v$ at the optimal solution on the remaining data set:
$$TH_w = \frac{1}{n-m}\Big(n\,D_{ww}F_S(w_S^*, v_S^*) - \sum_{z_i \in U} D_{ww}f(w_S^*, v_S^*; z_i)\Big)$$
$$TH_v = \frac{1}{n-m}\Big(n\,D_{vv}F_S(w_S^*, v_S^*) - \sum_{z_i \in U} D_{vv}f(w_S^*, v_S^*; z_i)\Big)$$
where $n$ is the size of the original data set, $m$ is the size of the deletion request data set $U$, $z_i$ is the $i$-th data sample in the data set, and $D_{ww}f(w_S^*, v_S^*; z_i)$ and $D_{vv}f(w_S^*, v_S^*; z_i)$ are computed in the same way as $D_{ww}F_S(w_S^*, v_S^*)$ and $D_{vv}F_S(w_S^*, v_S^*)$.
It should be appreciated that the original data set contains the deletion request data set, so removing the samples of the deletion request data set from the original data set leaves the remaining data, from which the remaining data set is constructed.
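Step (3.1) can be sketched as follows, under the assumption that the stored total Hessian is an average over the $n$ original samples, so that the per-sample total Hessians of the deleted samples can be subtracted and the result renormalized by $n - m$. The function name, argument names, and numerical values are hypothetical.

```python
import numpy as np

def remaining_total_hessian(D_full, D_deleted, n):
    """Total Hessian on the remaining data set.

    D_full    : stored total Hessian on the original data set (average over n samples)
    D_deleted : list of per-sample total Hessians, one per deleted sample
    n         : size of the original data set
    """
    m = len(D_deleted)
    if m >= n:
        raise ValueError("cannot delete every sample of the original data set")
    # Undo the 1/n average, remove the deleted samples, re-average over n - m.
    return (n * D_full - np.sum(D_deleted, axis=0)) / (n - m)

# Illustrative values: n = 5 samples, m = 2 of them deleted.
D_full = np.array([[2.0, 0.0], [0.0, 4.0]])
D_deleted = [np.array([[1.0, 0.0], [0.0, 2.0]]),
             np.array([[3.0, 0.0], [0.0, 6.0]])]
TH_w = remaining_total_hessian(D_full, D_deleted, n=5)
```

Only the stored matrix and the deleted samples are needed here, which matches the statement that the complete original data set does not have to be kept.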
(3.2) Using the optimal solution $(w_S^*, v_S^*)$ on the original data set obtained in step (1) and the total Hessian matrices $TH_w$ and $TH_v$ on the remaining data set obtained in step (3.1), perform a Newton-step forgetting update on the minimization parameter and the maximization parameter to obtain the updated minimization parameter $\hat{w}$ and maximization parameter $\hat{v}$:
$$\hat{w} = w_S^* + \frac{1}{n-m}\,TH_w^{-1}\sum_{z_i \in U}\partial_w f(w_S^*, v_S^*; z_i)$$
$$\hat{v} = v_S^* + \frac{1}{n-m}\,TH_v^{-1}\sum_{z_i \in U}\partial_v f(w_S^*, v_S^*; z_i)$$
where $\partial_w$ and $\partial_v$ denote the first derivatives with respect to $w$ and $v$, respectively, $n$ is the size of the original data set, $U$ is the deletion request data set, and $m$ is the size of the deletion request data set; when $TH_w$ and $TH_v$ are invertible, $TH_w^{-1}$ and $TH_v^{-1}$ denote the inverses of $TH_w$ and $TH_v$, respectively.
The forgetting update of the minimization and maximization parameters of the minimax learning model is a Newton step: the sum of the gradients of the loss function over the data points targeted for deletion gives the direction, and the average of the total Hessian over all remaining points serves as the curvature. Using the total Hessian matrix rather than only the direct Hessian matrix captures the nested influence between the minimization parameter $w$ and the maximization parameter $v$.
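The Newton step described above can be sketched as follows. This assumes the update adds, to each parameter, the inverse remaining-data total Hessian applied to the summed per-sample gradients of the deleted points, scaled by $1/(n-m)$; the function name, argument names, and numbers are hypothetical, and the result approximates retraining rather than reproducing it exactly.

```python
import numpy as np

def newton_forget(w_star, v_star, TH_w, TH_v, grads_w, grads_v, n):
    """One Newton-step forgetting update.

    grads_w, grads_v : arrays of shape (m, dim) holding the per-sample gradients
                       of the loss at (w_star, v_star) for the m deleted samples
    TH_w, TH_v       : total Hessians on the remaining data set (assumed invertible)
    n                : size of the original data set
    """
    m = len(grads_w)
    # solve(TH, g) computes TH^{-1} g without forming the inverse explicitly.
    w_hat = w_star + np.linalg.solve(TH_w, np.sum(grads_w, axis=0)) / (n - m)
    v_hat = v_star + np.linalg.solve(TH_v, np.sum(grads_v, axis=0)) / (n - m)
    return w_hat, v_hat

# Illustrative values: n = 6 samples, m = 2 deleted.
w_star = np.array([1.0, 2.0])
v_star = np.array([0.0])
TH_w = np.diag([2.0, 4.0])
TH_v = np.array([[-2.0]])
grads_w = np.array([[2.0, 4.0], [2.0, 4.0]])
grads_v = np.array([[1.0], [3.0]])
w_hat, v_hat = newton_forget(w_star, v_star, TH_w, TH_v, grads_w, grads_v, n=6)
```

The cost of this update depends on the parameter dimensions and on the number of deleted samples, not on the size of the original data set, which is the motivation for preferring it over retraining.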
(4) Add Gaussian noise as a random perturbation to the updated minimization parameter and maximization parameter obtained in step (3) to obtain the final forgetting model, and complete verifiable data forgetting privacy protection according to the forgetting model.
In this embodiment, Gaussian noise is added to the updated minimization parameter and maximization parameter obtained in step (3) as follows:
$$w^u = \hat{w} + \xi_1, \qquad v^u = \hat{v} + \xi_2$$
where $w^u$ and $v^u$ are the parameters of the final forgetting model; $\xi_1 \sim \mathcal{N}(0, \sigma_1^2 I)$ and $\xi_2 \sim \mathcal{N}(0, \sigma_2^2 I)$ are the Gaussian noise added to the updated minimization parameter $\hat{w}$ and the updated maximization parameter $\hat{v}$, respectively; $I$ is the identity matrix; $\sigma_1$ and $\sigma_2$ are the standard deviations of the respective Gaussian noise distributions and are set as functions of $\gamma_1$, $\gamma_2$, $\epsilon$ and $\delta$, where $\gamma_1$ and $\gamma_2$ are jointly determined by the properties of the loss function $f(\cdot)$, the original data set size $n$ and the size $m$ of the data set targeted for deletion, and $\epsilon$ and $\delta$ are user-defined privacy parameters.
In this embodiment, adding the random perturbation achieves a verifiable deletion guarantee: after data deletion, the minimax learning model behaves as if the deleted data had never been observed.
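The perturbation of step (4) can be sketched as below. The standard deviations are passed in as plain numbers: how they are calibrated from $\gamma_1$, $\gamma_2$, $\epsilon$ and $\delta$ is not reproduced here, and the function name and sample values are assumptions made for illustration.

```python
import numpy as np

def add_forgetting_noise(w_hat, v_hat, sigma1, sigma2, rng=None):
    """Add isotropic Gaussian noise N(0, sigma^2 I) to each updated parameter.

    sigma1, sigma2 : standard deviations, assumed to be calibrated elsewhere
    rng            : optional numpy Generator, for reproducible draws
    """
    rng = np.random.default_rng() if rng is None else rng
    w_u = w_hat + rng.normal(0.0, sigma1, size=np.shape(w_hat))
    v_u = v_hat + rng.normal(0.0, sigma2, size=np.shape(v_hat))
    return w_u, v_u

# Illustrative values only.
w_hat = np.array([1.5, 2.5])
v_hat = np.array([-0.5])
w_u, v_u = add_forgetting_noise(w_hat, v_hat, 0.1, 0.2, np.random.default_rng(0))
```

Each coordinate receives an independent draw, which corresponds to a covariance proportional to the identity matrix.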
Corresponding to the above embodiment of the verifiable data forgetting privacy protection method for a minimax learning model, the application also provides an embodiment of a verifiable data forgetting privacy protection device for a minimax learning model.
Referring to FIG. 2, the verifiable data forgetting privacy protection device for a minimax learning model provided by the embodiment of the application comprises one or more processors configured to implement the verifiable data forgetting privacy protection method for a minimax learning model of the above embodiment.
The embodiment of the verifiable data forgetting privacy protection device for a minimax learning model can be applied to any device with data processing capability, such as a computer. The device embodiment may be implemented in software, in hardware, or in a combination of hardware and software. Taking a software implementation as an example, the device in the logical sense is formed when the processor of the device with data processing capability reads the corresponding computer program instructions from nonvolatile memory into memory and runs them. In terms of hardware, FIG. 2 shows the hardware structure of the device with data processing capability in which the verifiable data forgetting privacy protection device for a minimax learning model of the application is located. In addition to the processor, memory, network interface, and nonvolatile memory shown in FIG. 2, the device with data processing capability in the embodiment generally includes other hardware according to its actual function, which is not described here.
The implementation process of the functions and roles of each unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.
For the device embodiments, reference is made to the description of the method embodiments for the relevant points, since they essentially correspond to the method embodiments. The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present application. Those of ordinary skill in the art will understand and implement the present application without undue burden.
The embodiment of the application also provides a computer-readable storage medium storing a program which, when executed by a processor, implements the verifiable data forgetting privacy protection method for a minimax learning model of the above embodiment.
The computer-readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any device with data processing capability described in any of the previous embodiments. The computer-readable storage medium may also be an external storage device of any device with data processing capability, for example, a plug-in hard disk, a Smart Media Card (SMC), an SD card, or a flash memory card provided on the device. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of any device with data processing capability. The computer-readable storage medium is used to store the computer program and other programs and data required by the device with data processing capability, and may also be used to temporarily store data that has been output or is to be output.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (7)
1. A verifiable data forgetting privacy protection method for a minimax learning model, characterized by comprising the following steps:
(1) For an original data set, computing the average of the loss function over all samples to obtain the empirical risk, and training the minimax learning model to obtain the optimal solution that minimizes and maximizes the empirical risk, which gives the minimization parameter and the maximization parameter of the minimax learning model;
(2) Computing the total Hessian matrices at the optimal solution obtained in step (1), each total Hessian matrix comprising a direct Hessian part and an indirect Hessian part;
(3) According to the optimal solution obtained in step (1), the total Hessian matrices obtained in step (2), and the user's data deletion request, performing a Newton-step forgetting update on the minimization parameter and the maximization parameter of the minimax learning model to obtain the updated minimization parameter and maximization parameter;
(4) Adding Gaussian noise as a random perturbation to the updated minimization parameter and maximization parameter obtained in step (3) to obtain the final forgetting model, and completing verifiable data forgetting privacy protection according to the forgetting model.
2. The verifiable data forgetting privacy protection method for a minimax learning model according to claim 1, wherein in step (1) the optimal solution at which the empirical risk is minimized and maximized is obtained as follows:
$$F_S(w, v) = \frac{1}{n}\sum_{i=1}^{n} f(w, v; z_i), \qquad w_S^* = \arg\min_{w}\max_{v} F_S(w, v), \qquad v_S^* = \arg\max_{v} F_S(w_S^*, v)$$
where $n$ is the size of the original dataset $S$, $z_i$ is the $i$-th data sample in the dataset, $f(\cdot)$ is the loss function, $F_S(\cdot)$ is the empirical risk on the original dataset, $w$ and $v$ denote the minimum parameter and the maximum parameter of the minimax learning model to be learned, $w_S^*$ is the minimum parameter that minimizes the empirical risk, and $v_S^*$ is the maximum parameter that maximizes the empirical risk.
3. The verifiable data forgetting privacy protection method for a minimax learning model according to claim 1, wherein in step (2) the total Hessian matrix at the optimal solution obtained in step (1) is computed as follows:
$$D_{ww}F_S = \partial_{ww}F_S - \partial_{wv}F_S\,\left(\partial_{vv}F_S\right)^{-1}\partial_{vw}F_S, \qquad D_{vv}F_S = \partial_{vv}F_S - \partial_{vw}F_S\,\left(\partial_{ww}F_S\right)^{-1}\partial_{wv}F_S$$
where every term is evaluated at the optimal solution $(w_S^*, v_S^*)$; $D_{ww}F_S$ and $D_{vv}F_S$ denote the total Hessian matrices with respect to the minimum parameter and the maximum parameter at the optimal solution, respectively; $\partial_{ww}F_S$ denotes the second partial derivative of the empirical risk function $F_S$ taken twice with respect to $w$; $\partial_{wv}F_S$ denotes the partial derivative of $F_S$ taken first with respect to $w$ and then with respect to $v$; $\partial_{vw}F_S$ denotes the partial derivative of $F_S$ taken first with respect to $v$ and then with respect to $w$; and $\partial_{vv}F_S$ denotes the second partial derivative of $F_S$ taken twice with respect to $v$.
4. The verifiable data forgetting privacy protection method for a minimax learning model according to claim 1, wherein step (3) comprises the following sub-steps:
(3.1) Constructing a deletion request dataset $U$ according to the user's data deletion request and, according to the deletion request dataset, using the optimal solution $(w_S^*, v_S^*)$ of the empirical risk on the original dataset obtained in step (1) and the total Hessian matrices $D_{ww}F_S(w_S^*, v_S^*)$ and $D_{vv}F_S(w_S^*, v_S^*)$ of the empirical risk at the optimal solution obtained in step (2), computing the total Hessian matrices $\mathrm{TH}_w$ and $\mathrm{TH}_v$ at the optimal solution on the remaining dataset:
$$\mathrm{TH}_w = \frac{1}{n-m}\Big(n\,D_{ww}F_S(w_S^*, v_S^*) - \sum_{z_i \in U} D_{ww}f(w_S^*, v_S^*; z_i)\Big), \qquad \mathrm{TH}_v = \frac{1}{n-m}\Big(n\,D_{vv}F_S(w_S^*, v_S^*) - \sum_{z_i \in U} D_{vv}f(w_S^*, v_S^*; z_i)\Big)$$
where $n$ is the size of the original dataset, $m$ is the size of the deletion request dataset $U$, and $z_i$ is the $i$-th data sample in the dataset;
(3.2) Using the optimal solution $(w_S^*, v_S^*)$ of the empirical risk on the original dataset obtained in step (1) and the total Hessian matrices $\mathrm{TH}_w$ and $\mathrm{TH}_v$ on the remaining dataset obtained in step (3.1), performing a Newton-step forgetting update on the minimum parameter and the maximum parameter $(w_S^*, v_S^*)$ to obtain the updated minimum parameter and maximum parameter $(\hat{w}, \hat{v})$:
$$\hat{w} = w_S^* + \frac{1}{n-m}\,\mathrm{TH}_w^{-1}\sum_{z_i \in U}\nabla_w f(w_S^*, v_S^*; z_i), \qquad \hat{v} = v_S^* + \frac{1}{n-m}\,\mathrm{TH}_v^{-1}\sum_{z_i \in U}\nabla_v f(w_S^*, v_S^*; z_i)$$
where $\nabla_w$ and $\nabla_v$ denote the first derivatives with respect to $w$ and $v$, respectively, $n$ is the size of the original dataset, $U$ is the deletion request dataset, and $m$ is the size of the deletion request dataset $U$.
5. The verifiable data forgetting privacy protection method for a minimax learning model according to claim 1, wherein in step (4) Gaussian noise is added to the updated minimum parameter and maximum parameter obtained in step (3) as follows:
$$w^{u} = \hat{w} + \xi_1, \qquad v^{u} = \hat{v} + \xi_2, \qquad \xi_1 \sim \mathcal{N}(0, \sigma_1^2 I), \qquad \xi_2 \sim \mathcal{N}(0, \sigma_2^2 I)$$
where $w^{u}$ and $v^{u}$ constitute the final forgetting model; $\xi_1$ and $\xi_2$ denote the additive Gaussian noise applied to the updated minimum parameter $\hat{w}$ and the updated maximum parameter $\hat{v}$, respectively; $I$ is the identity matrix; and $\sigma_1$ and $\sigma_2$ are the standard deviations of the respective Gaussian noise distributions.
6. A verifiable data forgetting privacy protection device for a minimax learning model, characterized by comprising one or more processors configured to implement the verifiable data forgetting privacy protection method for a minimax learning model according to any one of claims 1-5.
7. A computer-readable storage medium storing a program which, when executed by a processor, implements the verifiable data forgetting privacy protection method for a minimax learning model according to any one of claims 1-5.
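Steps (2) to (4) of claim 1 can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the implementation of the application. It assumes that the total Hessian has the usual Schur-complement form (direct block minus the indirect part), that the total Hessian on the remaining data is obtained by subtracting the deleted samples' contribution and rescaling by 1/(n-m), and that the Newton step adds 1/(n-m) times the inverse total Hessian applied to the summed gradients of the deleted samples. The function names, the toy quadratic loss, and the constants `a`, `c`, `B` are all hypothetical.

```python
import numpy as np


def total_hessians(H_ww, H_wv, H_vw, H_vv):
    """Step (2): direct Hessian block minus the indirect (Schur-complement) part."""
    D_ww = H_ww - H_wv @ np.linalg.solve(H_vv, H_vw)
    D_vv = H_vv - H_vw @ np.linalg.solve(H_ww, H_wv)
    return D_ww, D_vv


def remaining_total_hessians(D_ww_S, D_vv_S, D_ww_U, D_vv_U, n):
    """Step (3.1): total Hessians on the remaining data.

    D_ww_U and D_vv_U are lists holding the per-sample total Hessians of the
    m deleted samples, all evaluated at the original optimum.
    """
    m = len(D_ww_U)
    TH_w = (n * D_ww_S - sum(D_ww_U)) / (n - m)
    TH_v = (n * D_vv_S - sum(D_vv_U)) / (n - m)
    return TH_w, TH_v


def newton_forget(w_star, v_star, TH_w, TH_v, G_w, G_v, n):
    """Step (3.2): one Newton step. G_w and G_v have shape (m, dim), one row
    of per-sample gradient for each deleted sample."""
    m = G_w.shape[0]
    w_hat = w_star + np.linalg.solve(TH_w, G_w.sum(axis=0)) / (n - m)
    v_hat = v_star + np.linalg.solve(TH_v, G_v.sum(axis=0)) / (n - m)
    return w_hat, v_hat


def add_gaussian_noise(w_hat, v_hat, sigma1, sigma2, rng):
    """Step (4): isotropic Gaussian perturbation of both parameter vectors."""
    w_u = w_hat + sigma1 * rng.standard_normal(w_hat.shape)
    v_u = v_hat + sigma2 * rng.standard_normal(v_hat.shape)
    return w_u, v_u


def forget_toy(X, Y, del_idx, a, c, B, sigma1, sigma2, rng):
    """Run steps (1)-(4) on the hypothetical toy loss
    f(w, v; (x, y)) = a/2 |w - x|^2 + w^T B v - c/2 |v - y|^2."""
    n, d_w, d_v = X.shape[0], X.shape[1], Y.shape[1]
    I_w, I_v = np.eye(d_w), np.eye(d_v)
    # Step (1): the saddle point of the mean loss solves a linear system.
    K = np.block([[a * I_w, B], [B.T, -c * I_v]])
    rhs = np.concatenate([a * X.mean(axis=0), -c * Y.mean(axis=0)])
    sol = np.linalg.solve(K, rhs)
    w_s, v_s = sol[:d_w], sol[d_w:]
    # Step (2): for this loss every sample has the same Hessian blocks.
    D_ww, D_vv = total_hessians(a * I_w, B, B.T, -c * I_v)
    m = len(del_idx)
    TH_w, TH_v = remaining_total_hessians(D_ww, D_vv, [D_ww] * m, [D_vv] * m, n)
    # Per-sample gradients of the deleted samples at the original optimum.
    G_w = a * (w_s - X[del_idx]) + B @ v_s
    G_v = B.T @ w_s - c * (v_s - Y[del_idx])
    w_hat, v_hat = newton_forget(w_s, v_s, TH_w, TH_v, G_w, G_v, n)
    return add_gaussian_noise(w_hat, v_hat, sigma1, sigma2, rng)


# Usage: forget samples 0, 1, 2 out of 50 synthetic samples.
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(50, 2)), rng.normal(size=(50, 1))
w_u, v_u = forget_toy(X, Y, [0, 1, 2], a=2.0, c=4.0,
                      B=np.array([[1.0], [0.0]]), sigma1=0.01, sigma2=0.01, rng=rng)
```

With `B` set to zero the toy problem decouples, and with zero noise the update reproduces retraining on the remaining samples exactly, which gives a quick sanity check on the sign and the 1/(n-m) scaling. With a nonzero coupling `B`, the update in this sketch should be read as an approximation to retraining rather than an exact match.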
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310498860.1A CN117056957A (en) | 2023-05-06 | 2023-05-06 | Verifiable data forgetting privacy protection method and device for minimum and maximum learning model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117056957A (en) | 2023-11-14 |
Family
ID=88654149
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310498860.1A Pending CN117056957A (en) | 2023-05-06 | 2023-05-06 | Verifiable data forgetting privacy protection method and device for minimum and maximum learning model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117056957A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117390685A (en) * | 2023-12-07 | 2024-01-12 | 湖北省楚天云有限公司 | Pedestrian re-identification data privacy protection method and system based on forgetting learning |
CN117390685B (en) * | 2023-12-07 | 2024-04-05 | 湖北省楚天云有限公司 | Pedestrian re-identification data privacy protection method and system based on forgetting learning |
CN117892843A (en) * | 2024-03-18 | 2024-04-16 | 中国海洋大学 | Machine learning data forgetting method based on game theory and cryptography |
CN117892843B (en) * | 2024-03-18 | 2024-06-04 | 中国海洋大学 | Machine learning data forgetting method based on game theory and cryptography |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN117056957A (en) | Verifiable data forgetting privacy protection method and device for minimum and maximum learning model | |
CN111177792B (en) | Method and device for determining target business model based on privacy protection | |
CN108111489B (en) | URL attack detection method and device and electronic equipment | |
KR102308871B1 (en) | Device and method to train and recognize object based on attribute of object | |
CN111523422B (en) | Key point detection model training method, key point detection method and device | |
CN107844784A (en) | Face identification method, device, computer equipment and readable storage medium storing program for executing | |
CN111523621A (en) | Image recognition method and device, computer equipment and storage medium | |
CN108229679A (en) | Convolutional neural networks de-redundancy method and device, electronic equipment and storage medium | |
CN109271958A (en) | The recognition methods of face age and device | |
US20190354100A1 (en) | Bayesian control methodology for the solution of graphical games with incomplete information | |
US9292801B2 (en) | Sparse variable optimization device, sparse variable optimization method, and sparse variable optimization program | |
CN114417427A (en) | Deep learning-oriented data sensitivity attribute desensitization system and method | |
CN111144369A (en) | Face attribute identification method and device | |
Glauner | Comparison of training methods for deep neural networks | |
CN113487039A (en) | Intelligent body self-adaptive decision generation method and system based on deep reinforcement learning | |
CN112738098A (en) | Anomaly detection method and device based on network behavior data | |
CN109101984B (en) | Image identification method and device based on convolutional neural network | |
Dalle Pezze et al. | A multi-label continual learning framework to scale deep learning approaches for packaging equipment monitoring | |
Chang et al. | Agent embeddings: a latent representation for pole-balancing networks | |
Jeon et al. | Scalable multi-agent inverse reinforcement learning via actor-attention-critic | |
CN116030502A (en) | Pedestrian re-recognition method and device based on unsupervised learning | |
CN113627404B (en) | High-generalization face replacement method and device based on causal inference and electronic equipment | |
Gholamalinejad et al. | Whitened gradient descent, a new updating method for optimizers in deep neural networks | |
CN116109853A (en) | Task processing model training method, task processing method, device and equipment | |
CN113837294A (en) | Model training and calling method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||