CN118171303A - Model data storage method, apparatus, electronic device, and computer-readable medium - Google Patents

Model data storage method, apparatus, electronic device, and computer-readable medium

Info

Publication number
CN118171303A
CN118171303A (application CN202410442426.6A)
Authority
CN
China
Prior art keywords
model
information
data set
encryption
training data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410442426.6A
Other languages
Chinese (zh)
Inventor
Li Tao (李涛)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Park Road Credit Information Co ltd
Original Assignee
Park Road Credit Information Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Park Road Credit Information Co ltd filed Critical Park Road Credit Information Co ltd
Priority to CN202410442426.6A priority Critical patent/CN118171303A/en
Publication of CN118171303A publication Critical patent/CN118171303A/en
Pending legal-status Critical Current

Landscapes

  • Storage Device Security (AREA)

Abstract

Embodiments of the present disclosure disclose model data storage methods, apparatus, electronic devices, and computer-readable media. One embodiment of the method comprises the following steps: acquiring a model training data set; determining data set importance degree information; determining an encryption mode; performing data encryption processing on each item of model training data to generate encryption information; determining at least one usage model; encrypting at least one model source file to generate at least one piece of encrypted model source file information; setting at least one connection identifier; and storing the model training data set and the encryption information in a first storage center, and the at least one usage model and the at least one piece of encrypted model source file information in a second storage center. This embodiment encrypts the model training data set and its at least one corresponding usage model, effectively keeping the model training data and the model source files confidential while keeping both convenient for subsequent use.

Description

Model data storage method, apparatus, electronic device, and computer-readable medium
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular, to a model data storage method, apparatus, electronic device, and computer readable medium.
Background
At present, with the continuous development of artificial intelligence, model training has become a main direction of that development. For model training, confidentiality protection of the model training data set and the model source files is an aspect that needs emphasis both during training and during model application. Such protection of model training data and model source files is generally achieved as follows: the model training data and model source files are protected directly using a related desensitization algorithm (e.g., the AES algorithm or a differential privacy algorithm).
However, when the model training data and the model source files are protected in the above manner, the following technical problem often exists:
The logic of the related desensitization algorithms is relatively simple, and joint retrieval and use of the desensitized model training data together with the desensitized model source files is cumbersome, so the application efficiency of the model training data and model source files is low.
In the process of solving this first technical problem with the above technical scheme, a second technical problem often arises: how to implement neural-network-based encryption of data, which is currently a main direction of development. For this problem, the conventional solution is usually to implement data encryption through a BP neural network. However, this conventional solution still has the following shortcoming: the obtained encrypted data is not accurate enough.
The above information disclosed in this background section is only for enhancement of understanding of the background of the inventive concept and, therefore, may contain information that does not constitute prior art already known to a person of ordinary skill in the art in this country.
Disclosure of Invention
This summary is provided to introduce concepts in a simplified form that are further described below in the detailed description. It is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure propose a model data storage method, apparatus, electronic device, and computer readable medium to solve one or more of the technical problems mentioned in the background section above.
In a first aspect, some embodiments of the present disclosure provide a model data storage method, including: acquiring a model training data set; determining importance degree information of a data set corresponding to the model training data set; determining an encryption mode corresponding to the model training data set according to the importance degree information of the data set; according to the encryption mode, carrying out data encryption processing on each model training data in the model training data set to generate encryption information; determining at least one usage model corresponding to the model training data set; encrypting at least one model source file corresponding to the at least one usage model to generate at least one encrypted model source file information; setting at least one connection identifier for the model training dataset and the at least one usage model; storing the model training data set and the encryption information in a first storage center, and storing the at least one usage model and the at least one encryption model source file information in a second storage center, wherein the first storage center and the second storage center communicate via a connection identification request, the connection identification request being generated based on a connection identification.
In a second aspect, some embodiments of the present disclosure provide a model data storage device comprising: an acquisition unit configured to acquire a model training dataset; a first determining unit configured to determine data set importance degree information corresponding to the model training data set; a second determining unit configured to determine an encryption mode corresponding to the model training data set according to the data set importance degree information; a data encryption processing unit configured to perform data encryption processing on each model training data in the model training data set according to the encryption manner to generate encryption information; a third determination unit configured to determine at least one usage model corresponding to the model training data set; an encryption unit configured to encrypt at least one model source file corresponding to the at least one usage model to generate at least one encrypted model source file information; a setting unit configured to set at least one connection identification for the model training data set and the at least one usage model; and a storage unit configured to store the model training data set and the encryption information in a first storage center, and store the at least one usage model and the at least one encryption model source file information in a second storage center, wherein the first storage center and the second storage center communicate via a connection identification request, the connection identification request being generated based on a connection identification.
In a third aspect, some embodiments of the present disclosure provide an electronic device comprising: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors, cause the one or more processors to implement the method as described in any of the implementations of the first aspect.
In a fourth aspect, some embodiments of the present disclosure provide a computer readable medium having a computer program stored thereon, wherein the program when executed by a processor implements a method as described in any of the implementations of the first aspect.
The above embodiments of the present disclosure have the following advantageous effects: with the model data storage method of some embodiments of the present disclosure, a model training data set and at least one corresponding usage model can be encrypted, so that the model training data and the model source files are effectively kept confidential while remaining convenient for subsequent use. Specifically, the reason that related encryption of training data and model source files is neither sufficiently protective nor sufficiently efficient for subsequent use is the following: the logic of the related desensitization algorithms is relatively simple, and joint retrieval and use of the desensitized model training data together with the desensitized model source files is cumbersome, so the application efficiency of the model training data and model source files is low. Based on this, the model data storage method of some embodiments of the present disclosure first obtains a model training data set as the data object to be encrypted and correspondingly stored, facilitating subsequent use of the data set. Then, the data set importance degree information corresponding to the model training data set is determined. Here, the determined data set importance degree information can be used for the subsequent determination of the encryption mode, and selecting an encryption mode whose degree of secrecy matches the data set importance degree information realizes adaptive encryption of resources. Next, according to the data set importance degree information, the encryption mode corresponding to the model training data set can be accurately determined, so as to realize confidentiality protection matched to the importance of the model training data set. Then, according to the encryption mode, each item of model training data in the model training data set can be accurately encrypted to generate encryption information, realizing targeted privacy protection. Next, at least one usage model corresponding to the model training data set is determined, which facilitates subsequent use of the model training data set and retrieval of the corresponding usage models. Further, at least one model source file corresponding to the at least one usage model is encrypted to accurately generate at least one piece of encrypted model source file information, realizing protection of the model source files. Further, at least one connection identifier is set for the model training data set and the at least one usage model, to facilitate subsequent use of the models and retrieval of the model training data set. Finally, the model training data set and the encryption information are stored in a first storage center, and the at least one usage model and the at least one piece of encrypted model source file information are stored in a second storage center, wherein the first storage center and the second storage center communicate via a connection identification request generated based on a connection identifier. Here, the differentiated storage across the first storage center and the second storage center achieves efficient storage and confidentiality of the model training data set and the usage models.
In addition, through the connection identification request, selective acquisition of the model training data set and the usage models can be realized, greatly improving acquisition efficiency.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow chart of some embodiments of a model data storage method according to the present disclosure;
FIG. 2 is a schematic structural diagram of some embodiments of a model data store according to the present disclosure;
Fig. 3 is a schematic structural diagram of an electronic device suitable for use in implementing some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings. Embodiments of the present disclosure and features of embodiments may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "a" or "an" and "a plurality" in this disclosure are illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Referring to FIG. 1, a flow 100 of some embodiments of a model data storage method according to the present disclosure is shown. The model data storage method comprises the following steps:
step 101, a model training dataset is obtained.
In some embodiments, the execution body of the model data storage method may acquire the model training data set through a wired or wireless connection. The model training data set may be a training data set for training a neural network model. For example, the model training data set may be a face image training data set.
Step 102, determining importance degree information of the data set corresponding to the model training data set.
In some embodiments, the execution body may determine the data set importance degree information corresponding to the model training data set. The data set importance degree information may represent the data importance of each item of model training data in the model training data set. It may be a number between 0 and 1; the larger the value, the higher the data importance of each item of model training data in the model training data set.
As an example, first, the execution body may determine the data quality and data content corresponding to the model training data set. Then, first importance degree information corresponding to the data quality and second importance degree information corresponding to the data content are determined. Finally, the first and second importance degree information are averaged to generate the data set importance degree information.
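The following minimal Python sketch illustrates the averaging step of this example. The function name and the choice of inputs are assumptions for illustration; the disclosure only requires a value in [0, 1].

```python
def dataset_importance(quality_score: float, content_score: float) -> float:
    """Average the quality- and content-based importance scores into [0, 1]."""
    for s in (quality_score, content_score):
        if not 0.0 <= s <= 1.0:
            raise ValueError("importance scores must lie in [0, 1]")
    return (quality_score + content_score) / 2.0

# Example: a high-quality face-image data set with sensitive content.
importance = dataset_importance(quality_score=0.9, content_score=0.8)  # 0.85
```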
Step 103, determining an encryption mode corresponding to the model training data set according to the data set importance degree information.
In some embodiments, the execution body may determine the encryption mode corresponding to the model training data set according to the data set importance degree information. The encryption mode characterizes how the model training data set is encrypted. In practice, the encryption modes may include: a data set label-based encryption mode and a data content encryption mode. The data set label-based encryption mode may encrypt the data set label corresponding to the data set, so that the model training data set corresponding to a data set label can be obtained only if the correct encrypted data set label is input. The data content encryption mode may encrypt each item of data in the data set.
As an example, the execution body may determine the encryption mode corresponding to the data set importance degree information by using an information association table, which characterizes the correspondence between data set importance and encryption modes.
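A minimal sketch of such an information association table follows; the importance thresholds are assumptions, as the disclosure does not fix the cut-offs.

```python
def select_encryption_mode(importance: float) -> str:
    """Look up the encryption mode in a threshold-based association table."""
    association_table = [
        (0.5, "data set label-based encryption"),  # lower importance
        (1.0, "data content encryption"),          # higher importance
    ]
    for upper_bound, mode in association_table:
        if importance <= upper_bound:
            return mode
    raise ValueError("importance must lie in [0, 1]")

mode = select_encryption_mode(0.85)  # "data content encryption"
```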
Step 104, performing data encryption processing on each item of model training data in the model training data set according to the encryption mode to generate encryption information.
In some embodiments, the execution body may execute the encryption logic corresponding to the encryption mode to encrypt each item of model training data in the model training data set and generate the encryption information.
In some optional implementations of some embodiments, in response to determining that the encryption mode is the data set label-based encryption mode, the execution body may encrypt the data set label corresponding to the model training data set using a pre-trained encoding and decoding model to generate the encryption information. The data set label comprises at least one keyword corresponding to the model training data set. The encoding and decoding model is a pre-trained model and, in practice, includes an encoding model and a decoding model. The encoding model may be a neural network model that encodes each item of input data. Its input may be set according to the encryption mode; for example, the input may be the data set label, or each item of model training data. Its output may be the coding matrices corresponding to the items of model training data, where a coding matrix characterizes the data feature semantics of the corresponding model training data. The encoding model may include a plurality of cascaded convolutional layers. The decoding model may be a neural network model that decodes each input matrix. Its inputs may be the respective input matrices; for example, each input matrix may be a coding matrix or an output of the encryption model. Its output may be the decoding matrices corresponding to the items of model training data, where a decoding matrix characterizes the feature semantics of the output target corresponding to the model training data. For example, for a graph generation model, the encoding and decoding model is used to generate a target scene graph, and the output of the corresponding decoding model may characterize the output scene graph corresponding to the model training data. The decoding model may likewise include a plurality of cascaded convolutional layers. The encoding and decoding model may be trained in a conventional manner based on gradient descent.
As an example, first, the execution body may input the data set label to the encoding model of the encoding and decoding model to output the respective first coding matrices. Each first coding matrix is then average-pooled to generate a first pooling result set. Finally, all first pooling results in the set are spliced to generate the encryption information. In other words, during training of the encoding and decoding model, model training is performed through the decoding model, while during application the output of the encoding model serves as the encryption base data from which the encryption information is generated.
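The following PyTorch sketch illustrates this encode-pool-splice flow under stated assumptions: a toy two-layer cascaded convolutional encoding model, average pooling of each layer's coding matrix, and concatenation of the pooled results. The layer sizes and the stand-in label vector are illustrative, not part of the disclosure.

```python
import torch
import torch.nn as nn

class CodingModel(nn.Module):
    """Cascaded convolutional encoding model; layer sizes are illustrative."""
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Conv1d(1, 8, kernel_size=3, padding=1),
            nn.Conv1d(8, 16, kernel_size=3, padding=1),
        ])

    def forward(self, x):
        outputs = []
        for layer in self.layers:
            x = torch.relu(layer(x))
            outputs.append(x)  # keep every layer's coding matrix
        return outputs

label_vec = torch.randn(1, 1, 32)            # stand-in encoding of a data set label
coding_matrices = CodingModel()(label_vec)   # first coding matrices
pooled = [m.mean(dim=-1).flatten() for m in coding_matrices]  # average pooling
encryption_info = torch.cat(pooled)          # splice the pooling results
```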
In some optional implementations of some embodiments, in response to determining that the encryption mode is the data content encryption mode, data encryption processing is performed on each item of model training data in the model training data set using a pre-trained encoding and decoding model to generate an encrypted data set as the encryption information.
As an example, first, the execution body may input each item of model training data to the encoding model of the encoding and decoding model to output the respective second coding matrices. Each second coding matrix is then average-pooled to generate a second pooling result set. Finally, all second pooling results in the set are spliced to generate the encryption information.
In some optional implementations of some embodiments, the encoding and decoding model includes: an encoding model, a hash-based encryption model, and a decoding model. The hash-based encryption model may be a neural network model implementing the hash-encryption principle. In particular, it may be a multi-layer cascade of fully connected layers, pre-trained on multiple hash training data sets. The input to the cascaded fully connected layers may be a matrix or array to be hashed.
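A minimal PyTorch sketch of such a cascaded fully connected "hash encryption" model follows. The depth, widths, and output code length are assumptions, and pre-training against hash targets is omitted.

```python
import torch
import torch.nn as nn

class HashEncryptionModel(nn.Module):
    """Multi-layer cascade of fully connected layers standing in for the
    hash-based encryption model; dimensions are illustrative assumptions."""
    def __init__(self, in_dim: int = 24, hash_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, hash_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Accept a matrix or array to be hashed; flatten, then map to a code.
        return self.net(x.reshape(x.shape[0], -1))

pooled_info = torch.randn(1, 4, 6)                 # pooled information in matrix form
code = HashEncryptionModel(in_dim=24)(pooled_info)  # shape (1, 16)
```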
Optionally, encrypting the data set label corresponding to the model training data set using the pre-trained encoding and decoding model to generate the encryption information may include the following steps:
First step: input the data set label into the encoding model to generate at least one piece of first label coding information. The encoding model is at least one first convolutional neural network in series, and the first convolutional neural networks correspond one-to-one with the pieces of first label coding information.
Second step: determine the first label coding information at a target position among the at least one piece of first label coding information as the target label coding information. The target position may be the position of the last layer of the series of first convolutional neural networks, and the target label coding information may be coding information in matrix form.
Third step: average-pool the target label coding information to generate pooled information.
Fourth step: encrypt each matrix element included in the pooled information using the encryption model to generate an encryption matrix as the model encryption information.
Fifth step: input the model encryption information into the decoding model to generate at least one piece of first label decoding information. The decoding model is at least one second convolutional neural network in series, and the second convolutional neural networks correspond one-to-one with the pieces of first label decoding information.
Sixth step: select the label decoding information at a second position among the at least one piece of first label decoding information as the target label decoding information. The second position may be a preselected one of the positions corresponding to the pieces of label decoding information; for example, the position of the second piece of first label decoding information. The target label decoding information may be information in matrix form.
Seventh step: combine the second position and the target label decoding information to generate the encryption information.
As an example, first, the execution body may encode the second position to generate a matrix of the same dimensions as the target label decoding information, as the position matrix. Then, the position matrix and the target label decoding information are combined to generate a combined matrix as the encryption information.
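A minimal sketch of this combination step, assuming the position is encoded by broadcasting its value into a constant matrix and that "combining" is elementwise addition; both choices are illustrative, as the disclosure does not fix them.

```python
import torch

def combine(second_position: int, target_decoding: torch.Tensor) -> torch.Tensor:
    # Encode the position as a constant matrix of the same dimensions,
    # then combine the two matrices (elementwise addition is an assumption).
    position_matrix = torch.full_like(target_decoding, float(second_position))
    return position_matrix + target_decoding  # combined matrix = encryption info

encryption_info = combine(2, torch.randn(4, 4))
```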
Considering the problem of the conventional solution described above, namely the second technical problem that the obtained encrypted data is not accurate enough, and in view of the current state of the art, the following solution is adopted.
In some optional implementations of some embodiments, performing data encryption processing on each item of model training data in the model training data set using a pre-trained encoding and decoding model to generate an encrypted data set as the encryption information may include the following steps (a sketch follows these steps):
First step: for each item of model training data in the model training data set, execute the following generation steps:
Substep 1: input the model training data into the encoding and decoding model to generate at least one piece of second label coding information and at least one piece of second label decoding information. The pieces of second label coding information correspond one-to-one with the first convolutional neural networks, and the pieces of second label decoding information correspond one-to-one with the second convolutional neural networks. The encoding and decoding model is a network model that outputs the corresponding business target in the business scenario corresponding to the model training data set.
Substep 2: determine a random number corresponding to the model training data. The random number is a value in [0, convolution value], where the convolution value may be the minimum of the information value corresponding to the at least one piece of second label coding information and the information value corresponding to the at least one piece of second label decoding information.
Substep 3: determine the second label coding information at the random-number position among the at least one piece of second label coding information as the candidate coding information, and the second label decoding information at the random-number position among the at least one piece of second label decoding information as the candidate decoding information.
Substep 4: splice the candidate coding information, the candidate decoding information, and the random number to generate the encrypted data.
Second step: determine the resulting encrypted data set as the encryption information.
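A minimal sketch of the generation step above. The encode and decode callables are hypothetical stand-ins for the per-layer outputs of the encoding and decoding models, the "information value" is taken to be the count of per-layer outputs, and splicing is read as flattening and concatenating; all are assumptions for illustration.

```python
import random
import torch

def encrypt_training_datum(datum, encode, decode):
    """Pick random-number-indexed coding/decoding candidates and splice them."""
    codings = encode(datum)                    # second label coding information
    decodings = decode(codings[-1])            # second label decoding information
    upper = min(len(codings), len(decodings)) - 1
    r = random.randint(0, upper)               # random number in [0, upper]
    candidate_enc = codings[r].flatten()       # candidate coding information
    candidate_dec = decodings[r].flatten()     # candidate decoding information
    # Splice the candidates and the random number into the encrypted datum.
    return torch.cat([candidate_enc, candidate_dec, torch.tensor([float(r)])])

# Toy stand-ins so the sketch runs end to end.
toy_encode = lambda d: [d * 2, d * 3]
toy_decode = lambda c: [c + 1, c - 1]
encrypted = encrypt_training_datum(torch.randn(3), toy_encode, toy_decode)
```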
The above optional implementation, as an inventive point of the present disclosure, solves the technical problem mentioned in the background art, namely that "the obtained encrypted data is not accurate enough." Based on this, through the randomness introduced by the at least one piece of second label coding information, the at least one piece of second label decoding information, and the random number, the candidate coding information and candidate decoding information can be determined so as to accurately generate encrypted data that carries the random number.
Step 105, determining at least one usage model corresponding to the model training data set.
In some embodiments, the execution body may determine at least one usage model corresponding to the model training data set. A usage model is a trained neural network model, whose training data set is the model training data set, that can be used directly. Each of the at least one usage model has a corresponding model usage scenario. For example, the at least one model usage scenario corresponding to the at least one usage model may include: a face recognition scenario, a character recognition scenario, and a limb key point determination scenario.
Step 106, encrypting at least one model source file corresponding to the at least one usage model to generate at least one piece of encrypted model source file information.
In some embodiments, the execution body may encrypt at least one model source file corresponding to the at least one usage model to generate at least one piece of encrypted model source file information. The usage models correspond one-to-one with the model source files. A model source file may be the code source file of a usage model, characterizing the model structure of the usage model and the corresponding set of model parameters. The pieces of encrypted model source file information correspond one-to-one with the usage models.
As an example, for each usage model, first, the execution body may determine the model usage description information corresponding to the usage model, which includes: the model usage scenario, the model input, and the model output. Then, the scenario information corresponding to the model usage scenario, the input description of the model input, and the output description of the model output are each information-encoded to obtain scenario coding information, model input coding information, and model output coding information. Next, the scenario coding information, model input coding information, and model output coding information are combined into matrix form to generate a combination matrix. Finally, for each row element group of the combination matrix, the individual elements in the group are averaged to generate an average matrix as the encrypted model source file information.
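A minimal sketch of this example. The text_encode function is a hypothetical information-encoding step; folding UTF-8 bytes into a fixed-length vector is purely illustrative, as the disclosure does not specify the encoding.

```python
import numpy as np

def text_encode(text: str, dim: int = 8) -> np.ndarray:
    """Toy information-encoding: fold UTF-8 bytes into a fixed-length vector."""
    raw = np.frombuffer(text.encode("utf-8"), dtype=np.uint8).astype(float)
    return np.resize(raw, dim)

scene_code = text_encode("face recognition scenario")  # scenario coding info
input_code = text_encode("RGB face image, 224x224")    # model input coding info
output_code = text_encode("identity label")            # model output coding info
combined = np.stack([scene_code, input_code, output_code])  # combination matrix
# Average the elements of each row element group to get the average matrix.
source_file_info = combined.mean(axis=1, keepdims=True)
```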
Step 107, setting at least one connection identifier for the model training dataset and the at least one usage model.
In some embodiments, the execution body may set at least one connection identifier for the model training data set and the at least one usage model. A connection identifier may characterize the association between a usage model and the model training data set, and may be information in numeric form. The connection identifiers correspond one-to-one with the usage models. A connection identifier is used for the subsequent generation of connection identification requests, through which the first storage center and the second storage center communicate. That is, a connection identification request may obtain, in the first storage center, the model training data set associated with the corresponding connection identifier, or obtain, in the second storage center, the usage model associated with the corresponding connection identifier.
Step 108, storing the model training data set and the encryption information in a first storage center, and storing the at least one usage model and the at least one encryption model source file information in a second storage center.
In some embodiments, the execution body may store the model training data set and the encryption information in a first storage center, and store the at least one usage model and the at least one piece of encrypted model source file information in a second storage center. The first storage center and the second storage center communicate via a connection identification request, which is generated based on a connection identifier. In practice, the connection identification request may be a request generated by filling the connection identifier into a preset request template.
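A minimal sketch of filling a connection identifier into a preset request template; the template text and field name are assumptions, since the disclosure only requires that the identifier be filled into a preset template.

```python
# Hypothetical preset request template for a connection identification request.
REQUEST_TEMPLATE = "GET /linked-resource?connection_id={cid}"

def build_connection_request(connection_id: int) -> str:
    """Fill the connection identifier into the preset request template."""
    return REQUEST_TEMPLATE.format(cid=connection_id)

request = build_connection_request(42)  # "GET /linked-resource?connection_id=42"
```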
In some optional implementations of some embodiments, after step 108, the method further includes the following steps (a sketch of this flow follows the steps below):
First step: in response to receiving a model use request, send the model use request to the second storage center for request parsing, obtaining first model encryption information and data set requirement information.
The model use request may be a request to invoke a usage model. The first model encryption information may be the encryption information of the model source file corresponding to the model to be invoked. The data set requirement information may characterize whether the model training data set corresponding to the model to be invoked is required; for example, it may either indicate that the corresponding model training data set does not need to be obtained, or indicate that it does.
As an example, the execution body may parse the model use request using a preset first regular expression to obtain the first model encryption information and the data set requirement information.
Second step: according to the first model encryption information, match a corresponding model source file in the second storage center as the first model source file.
As an example, first, the execution body may obtain, from the second storage center, the encrypted model source file information matching the first model encryption information as the first target encrypted model source file information. Then, the model source file corresponding to the first target encrypted model source file information is determined in the second storage center as the first model source file.
Third step: in response to the data set requirement information indicating that the corresponding model training data set should be obtained, parse the data set requirement information to generate first data set encryption information.
As an example, the execution body may parse the data set requirement information using a preset second regular expression to generate the first data set encryption information.
Fourth step: generate a first connection identification request for the first model encryption information and the first data set encryption information. The first connection identification request may be request information for obtaining the model training data set corresponding to the first model encryption information and the first data set encryption information.
As an example, first, a request generation template corresponding to connection identification requests is obtained. Then, the first model encryption information and the first data set encryption information are added to the request generation template to generate the first connection identification request.
Fifth step: send the first connection identification request to the first storage center for request parsing, obtaining the first data set encryption information. The first data set encryption information may be the encryption information obtained by encrypting the corresponding data set.
Sixth step: match the first model training data set corresponding to the first data set encryption information in the first storage center.
As an example, first, the first target encryption information matching the first data set encryption information is determined in the first storage center. Then, the model training data set corresponding to the first target encryption information is determined as the first model training data set.
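A minimal sketch of the request-parsing portion of this flow, with regular expressions standing in for the preset regular terms. The request format, field names, and the toy storage-center mapping are all assumptions for illustration.

```python
import re

def parse_model_use_request(request: str):
    """Extract first model encryption info and data set requirement info."""
    model_enc = re.search(r"model_enc=(\w+)", request).group(1)
    need_dataset = re.search(r"need_dataset=(yes|no)", request).group(1)
    return model_enc, need_dataset

# Toy second storage center mapping encrypted source-file info to files.
second_center = {"a1b2": "usage_model_source.bin"}

enc, need = parse_model_use_request("model_enc=a1b2;need_dataset=yes")
first_model_source = second_center[enc]  # matched first model source file
```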
In some optional implementations of some embodiments, after step 108, the method further includes the following steps:
First step: in response to receiving a data set use request, send the data set use request to the first storage center for request parsing, obtaining second data set encryption information and model requirement information.
The data set use request may be a request to invoke and use a data set. The second data set encryption information is the encryption information corresponding to the data set to be retrieved and used. The model requirement information may characterize whether the corresponding usage model file should be obtained; it may either indicate that the corresponding usage model file should be obtained, or indicate that it should not.
Second step: according to the second data set encryption information, match a corresponding second model training data set in the first storage center.
As an example, first, the second target encryption information corresponding to the second data set encryption information is determined in the first storage center. Then, the second model training data set corresponding to the second target encryption information is determined.
Third step: in response to determining that the model requirement information indicates that the corresponding usage model file should be obtained, parse the model requirement information to generate second model encryption information.
As an example, the execution body may parse the model requirement information using a preset third regular expression to generate the second model encryption information, which may be the encryption information of the model source file corresponding to the model to be invoked.
Fourth step: generate a second connection identification request for the second data set encryption information and the second model encryption information. The second connection identification request may be request information for obtaining the model source file corresponding to the second model encryption information and the second data set encryption information.
As an example, first, a request generation template corresponding to connection identification requests is obtained. Then, the second model encryption information and the second data set encryption information are added to the request generation template to generate the second connection identification request.
Fifth step: send the second connection identification request to the second storage center for request parsing, obtaining the second model encryption information.
Sixth step: match the model source file corresponding to the second model encryption information in the second storage center as the second model source file.
As an example, first, the second target encrypted model source file information matching the second model encryption information is determined in the second storage center. Then, the model source file corresponding to that information is determined as the second model source file.
With further reference to fig. 2, as an implementation of the methods shown in the above figures, the present disclosure provides some embodiments of a model data storage device, corresponding to the method embodiments shown in fig. 1, and the device may be specifically applied in various electronic devices.
As shown in fig. 2, a model data storage device 200 includes: an acquisition unit 201, a first determination unit 202, a second determination unit 203, a data encryption processing unit 204, a third determination unit 205, an encryption unit 206, a setting unit 207, and a storage unit 208. The acquisition unit 201 is configured to acquire a model training data set; the first determination unit 202 is configured to determine data set importance degree information corresponding to the model training data set; the second determination unit 203 is configured to determine an encryption mode corresponding to the model training data set according to the data set importance degree information; the data encryption processing unit 204 is configured to perform data encryption processing on each item of model training data in the model training data set according to the encryption mode to generate encryption information; the third determination unit 205 is configured to determine at least one usage model corresponding to the model training data set; the encryption unit 206 is configured to encrypt at least one model source file corresponding to the at least one usage model to generate at least one piece of encrypted model source file information; the setting unit 207 is configured to set at least one connection identifier for the model training data set and the at least one usage model; and the storage unit 208 is configured to store the model training data set and the encryption information in a first storage center, and store the at least one usage model and the at least one piece of encrypted model source file information in a second storage center, wherein the first storage center and the second storage center communicate via a connection identification request generated based on a connection identifier.
It will be appreciated that the units described in the model data storage device 200 correspond to the various steps of the method described with reference to fig. 1. Thus, the operations, features, and advantages described above for the method apply equally to the model data storage device 200 and the units contained therein, and are not repeated here.
Referring now to fig. 3, a schematic diagram of an electronic device 300 suitable for use in implementing some embodiments of the present disclosure is shown. The electronic device shown in fig. 3 is merely an example and should not impose any limitation on the functionality or scope of use of embodiments of the present disclosure.
As shown in fig. 3, the electronic device 300 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 301 that may perform various suitable actions and processes in accordance with a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage means 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data required for the operation of the electronic apparatus 300 are also stored. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
In general, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 308 including, for example, magnetic tape, hard disk, etc.; and communication means 309. The communication means 309 may allow the electronic device 300 to communicate with other devices wirelessly or by wire to exchange data. While fig. 3 shows an electronic device 300 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 3 may represent one device or a plurality of devices as needed.
In particular, according to some embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via communications device 309, or from storage device 308, or from ROM 302. The above-described functions defined in the methods of some embodiments of the present disclosure are performed when the computer program is executed by the processing means 301.
It should be noted that, in some embodiments of the present disclosure, the computer readable medium may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring a model training data set; determining importance degree information of a data set corresponding to the model training data set; determining an encryption mode corresponding to the model training data set according to the importance degree information of the data set; according to the encryption mode, carrying out data encryption processing on each model training data in the model training data set to generate encryption information; determining at least one usage model corresponding to the model training data set; encrypting at least one model source file corresponding to the at least one usage model to generate at least one encrypted model source file information; setting at least one connection identifier for the model training dataset and the at least one usage model; storing the model training data set and the encryption information in a first storage center, and storing the at least one usage model and the at least one encryption model source file information in a second storage center, wherein the first storage center and the second storage center communicate via a connection identification request, the connection identification request being generated based on a connection identification.
Computer program code for carrying out operations for some embodiments of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in some embodiments of the present disclosure may be implemented by means of software, or by means of hardware. The described units may also be provided in a processor, for example, described as: a processor including an acquisition unit, a first determination unit, a second determination unit, a data encryption processing unit, a third determination unit, an encryption unit, a setting unit, and a storage unit. The names of the units do not in some cases limit the units themselves; for example, the acquisition unit may also be described as "a unit that acquires a model training data set".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
The foregoing description is merely of preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and also encompasses other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by substituting the above features with (but not limited to) technical features having similar functions disclosed in the embodiments of the present disclosure.

Claims (9)

1. A model data storage method, comprising:
acquiring a model training data set;
determining data set importance degree information corresponding to the model training data set;
determining an encryption mode corresponding to the model training data set according to the data set importance degree information;
performing data encryption processing on each model training data in the model training data set according to the encryption mode to generate encryption information;
determining at least one usage model corresponding to the model training data set;
encrypting at least one model source file corresponding to the at least one usage model to generate at least one encrypted model source file information;
setting at least one connection identification for the model training data set and the at least one usage model; and
storing the model training data set and the encryption information in a first storage center, and storing the at least one usage model and the at least one encrypted model source file information in a second storage center, wherein the first storage center and the second storage center communicate via a connection identification request, the connection identification request being generated based on a connection identification.
2. The method of claim 1, wherein the performing data encryption processing on each model training data in the model training data set according to the encryption mode to generate encryption information comprises:
in response to determining that the encryption mode is a data set label based encryption mode, encrypting a data set label corresponding to the model training data set by using a pre-trained encoding and decoding model to generate the encryption information, wherein the data set label comprises: at least one keyword corresponding to the model training data set.
3. The method of claim 1, wherein the performing data encryption processing on each model training data in the model training data set according to the encryption mode to generate encryption information comprises:
in response to determining that the encryption mode is a data content encryption mode, performing data encryption processing on each model training data in the model training data set by using a pre-trained encoding and decoding model to generate an encrypted data set as the encryption information.
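Taken together, claims 2 and 3 amount to a dispatch on the encryption mode: the label-based mode encrypts only the data set label (its keywords), while the content mode runs every training record through the pre-trained encoding and decoding model. A minimal sketch, assuming a codec_encrypt callable that stands in for that model (a sketch of its internals follows claim 6):

```python
def generate_encryption_info(mode, training_data, dataset_label, codec_encrypt):
    # codec_encrypt stands in for the pre-trained encoding and decoding model.
    if mode == "label_based":
        # Claim 2: encrypt the data set label, i.e. its keyword(s).
        return codec_encrypt(dataset_label)
    if mode == "content_based":
        # Claim 3: encrypt each model training data record; the resulting
        # encrypted data set serves as the encryption information.
        return [codec_encrypt(record) for record in training_data]
    raise ValueError(f"unknown encryption mode: {mode!r}")
```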
4. The method of claim 1, wherein the method further comprises:
in response to receiving a model use request, sending the model use request to the second storage center to perform request analysis on the model use request and acquire first model encryption information and data set demand information;
matching, according to the first model encryption information, a corresponding model source file in the second storage center as a first model source file;
in response to determining that the data set demand information characterizes acquisition of a corresponding model training data set, performing information analysis on the data set demand information to generate first data set encryption information;
generating a first connection identification request for the first model encryption information and the first data set encryption information;
sending the first connection identification request to the first storage center to perform request analysis on the first connection identification request and acquire the first data set encryption information; and
matching a first model training data set corresponding to the first data set encryption information in the first storage center.
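As an illustration of the exchange in claim 4 (claim 5 mirrors it with the roles of the two storage centers swapped), the sketch below models both centers as plain dictionaries; every key name and helper is an assumption made for readability, not terminology from the disclosure:

```python
def handle_model_use_request(request: dict, first_center: dict, second_center: dict):
    # Request analysis at the second storage center yields the first model
    # encryption information and the data set demand information.
    model_enc = request["model_encryption_info"]
    demand = request["dataset_demand_info"]

    # Match the corresponding model source file held by the second center.
    first_model_source = second_center["sources"].get(model_enc)

    first_training_set = None
    if demand.get("need_training_data"):
        # Information analysis of the demand yields the first data set
        # encryption information, which is wrapped into a connection
        # identification request sent to the first storage center.
        connection_request = {
            "connection_id": request["connection_id"],
            "dataset_enc": demand["dataset_encryption_info"],
        }
        # The first center analyzes the request and matches the first
        # model training data set against that encryption information.
        first_training_set = first_center["datasets"].get(connection_request["dataset_enc"])
    return first_model_source, first_training_set
```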
5. The method of claim 1, wherein the method further comprises:
in response to receiving a data set use request, sending the data set use request to the first storage center to perform request analysis on the data set use request and acquire second data set encryption information and model requirement information;
matching a corresponding second model training data set in the first storage center according to the second data set encryption information;
in response to determining that the model requirement information characterizes acquisition of a corresponding usage model file, performing information analysis on the model requirement information to generate second model encryption information;
generating a second connection identification request for the second data set encryption information and the second model encryption information;
sending the second connection identification request to the second storage center to perform request analysis on the second connection identification request and acquire the second model encryption information; and
matching a model source file corresponding to the second model encryption information in the second storage center as a second model source file.
6. The method of claim 2, wherein the encoding and decoding model comprises: an encoding model, a hash-encryption-based encryption model, and a decoding model; and
the encrypting the data set label corresponding to the model training data set by using a pre-trained encoding and decoding model to generate the encryption information comprises:
inputting the data set label into the encoding model to generate at least one first tag coding information, wherein the encoding model is at least one first convolutional neural network connected in series, and the at least one first convolutional neural network has a one-to-one correspondence with the at least one first tag coding information;
determining first tag coding information at a target position in the at least one first tag coding information as target tag coding information;
performing average pooling on the target tag coding information to generate pooled information;
encrypting each matrix element included in the pooled information by using the encryption model to generate an encryption matrix as model encryption information;
inputting the model encryption information into the decoding model to generate at least one first tag decoding information, wherein the decoding model is at least one second convolutional neural network connected in series, and the at least one second convolutional neural network has a one-to-one correspondence with the at least one first tag decoding information;
selecting first tag decoding information at a second position from the at least one first tag decoding information as target tag decoding information; and
combining the second position and the target tag decoding information to generate the encryption information.
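Claim 6 spells out a pipeline: serial convolutional encoders producing one coding per stage, selection of the coding at a target position, average pooling, element-wise hash encryption of the pooled matrix, serial convolutional decoders, selection at a second position, and recombination. The toy NumPy sketch below traces that data flow only; fixed random projections stand in for the trained convolutional networks, and folding the hashes back into numbers before decoding is an invention of the sketch to keep it runnable (the disclosure does not specify how the decoding model consumes the encrypted matrix):

```python
import hashlib
import numpy as np

rng = np.random.default_rng(0)

def serial_cnn_stack(x: np.ndarray, depth: int) -> list[np.ndarray]:
    # Stand-in for "at least one convolutional neural network connected in
    # series": each stage here is a fixed random projection plus ReLU.
    outputs = []
    for _ in range(depth):
        w = rng.standard_normal((x.shape[-1], x.shape[-1]))
        x = np.maximum(x @ w, 0.0)
        outputs.append(x)  # one output per stage, in one-to-one correspondence
    return outputs

def hash_encrypt_matrix(m: np.ndarray) -> list[str]:
    # Element-wise hash encryption of the pooled information.
    return [hashlib.sha256(repr(round(float(v), 6)).encode()).hexdigest()
            for v in m.ravel()]

def encrypt_dataset_label(label_vec: np.ndarray, target_pos: int, second_pos: int):
    encodings = serial_cnn_stack(label_vec, depth=3)   # encoding model
    target = encodings[target_pos]                     # target position
    pooled = target.reshape(-1, 4).mean(axis=1)        # average pooling
    enc_matrix = hash_encrypt_matrix(pooled)           # encryption model
    # Fold hashes back into numbers so the decoding stage can run.
    numeric = np.array([int(h[:8], 16) / 2**32 for h in enc_matrix])
    decodings = serial_cnn_stack(numeric, depth=3)     # decoding model
    target_decoding = decodings[second_pos]            # second position
    return second_pos, target_decoding                 # combined encryption info

second_pos, info = encrypt_dataset_label(rng.standard_normal(16),
                                         target_pos=1, second_pos=2)
```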
7. A model data storage device, comprising:
an acquisition unit configured to acquire a model training data set;
a first determining unit configured to determine data set importance degree information corresponding to the model training data set;
a second determining unit configured to determine an encryption mode corresponding to the model training data set according to the data set importance degree information;
a data encryption processing unit configured to perform data encryption processing on each model training data in the model training data set according to the encryption mode to generate encryption information;
a third determining unit configured to determine at least one usage model corresponding to the model training data set;
an encryption unit configured to encrypt at least one model source file corresponding to the at least one usage model to generate at least one encrypted model source file information;
a setting unit configured to set at least one connection identification for the model training data set and the at least one usage model; and
a storage unit configured to store the model training data set and the encryption information in a first storage center, and store the at least one usage model and the at least one encrypted model source file information in a second storage center, wherein the first storage center and the second storage center communicate via a connection identification request, the connection identification request being generated based on a connection identification.
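Structurally, the apparatus of claim 7 maps onto an object whose methods correspond one-to-one with the units; the skeleton below is a hypothetical illustration of that decomposition, not code from the disclosure:

```python
class ModelDataStorageDevice:
    """Hypothetical skeleton mirroring the units of claim 7."""

    def acquire(self):                                  # acquisition unit
        raise NotImplementedError

    def determine_importance(self, dataset):            # first determining unit
        raise NotImplementedError

    def determine_encryption_mode(self, importance):    # second determining unit
        raise NotImplementedError

    def encrypt_data(self, dataset, mode):              # data encryption processing unit
        raise NotImplementedError

    def determine_usage_models(self, dataset):          # third determining unit
        raise NotImplementedError

    def encrypt_model_sources(self, usage_models):      # encryption unit
        raise NotImplementedError

    def set_connection_identifications(self, dataset, usage_models):  # setting unit
        raise NotImplementedError

    def store(self, first_center, second_center):       # storage unit
        raise NotImplementedError
```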
8. An electronic device, comprising:
one or more processors; and
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-6.
9. A computer readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method of any one of claims 1-6.
CN202410442426.6A 2024-04-12 2024-04-12 Model data storage method, apparatus, electronic device, and computer-readable medium Pending CN118171303A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410442426.6A CN118171303A (en) 2024-04-12 2024-04-12 Model data storage method, apparatus, electronic device, and computer-readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410442426.6A CN118171303A (en) 2024-04-12 2024-04-12 Model data storage method, apparatus, electronic device, and computer-readable medium

Publications (1)

Publication Number Publication Date
CN118171303A true CN118171303A (en) 2024-06-11

Family

ID=91352958

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410442426.6A Pending CN118171303A (en) 2024-04-12 2024-04-12 Model data storage method, apparatus, electronic device, and computer-readable medium

Country Status (1)

Country Link
CN (1) CN118171303A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination