CN117892356A

CN117892356A - Water quality data processing method and device, electronic equipment and storage medium

Info

Publication number: CN117892356A
Application number: CN202410294002.XA
Authority: CN
Inventors: 王宇昊; 魏松瑞; 张晗; 邓晨旭
Original assignee: Shenzhen University
Current assignee: Shenzhen University
Priority date: 2024-03-14
Filing date: 2024-03-14
Publication date: 2024-04-16
Anticipated expiration: 2044-03-14
Also published as: CN117892356B

Abstract

The application provides a water quality data processing method and device, electronic equipment and a storage medium, belonging to the technical field of water conservancy monitoring. The method comprises: obtaining initial water quality data that includes sensitive data; performing noise generation on the sensitive data through a preset noise generation network to obtain pseudo data; encrypting the initial water quality data and the pseudo data to obtain encrypted water quality data; signing the encrypted water quality data to determine a target signature; verifying the target signature and storing the encrypted water quality data that passes verification to a blockchain; and, in response to a transaction request to the blockchain, decrypting the encrypted water quality data according to the transaction request to obtain decrypted water quality data. The decrypted water quality data comprises a data identification tag used for distinguishing the sensitive data from the pseudo data. The method thereby guarantees the data security of the water quality data.

Description

Water quality data processing method and device, electronic equipment and storage medium

Technical Field

The application relates to the technical field of water conservancy monitoring, in particular to a water quality data processing method and device, electronic equipment and a storage medium.

Background

Water quality detection is one of the important means for ensuring the safety of water resources. The water quality data obtained by water quality detection contains sensitive information such as object position information, time information, ammonia nitrogen content, pollutant sources and the like, and in order to avoid leakage of the sensitive information, data privacy protection needs to be carried out on the sensitive information in the water quality data so as to ensure the data security of the water quality data.

Disclosure of Invention

The embodiment of the application mainly aims to provide a water quality data processing method, a water quality data processing device, electronic equipment and a storage medium, and aims to ensure the data safety of water quality data and realize the privacy protection of the water quality data.

To achieve the above object, a first aspect of an embodiment of the present application provides a water quality data processing method, including:

acquiring initial water quality data; the initial water quality data includes sensitive data;

noise generation is carried out on the sensitive data through a preset noise generation network, so that pseudo data are obtained;

encrypting the initial water quality data and the pseudo data to obtain encrypted water quality data;

signing the encrypted water quality data to determine a target signature;

verifying the target signature, and storing the encrypted water quality data that passes verification to a blockchain;

responding to a transaction request of the blockchain, and decrypting the encrypted water quality data according to the transaction request to obtain decrypted water quality data; the decrypted water quality data includes a data identification tag; the data identification tag is used to distinguish between the sensitive data and the pseudo data.

In some embodiments, the preset noise generation network is trained according to the following steps:

acquiring sample water quality data and randomly generating a noise sequence;

obtaining a guide vector;

generating data of the sample water quality data, the guide vector and the noise sequence through a preset generator to obtain pseudo water quality data;

performing Monte Carlo sampling on the pseudo water quality data to obtain target water quality data;

calculating word shift distance and Wasserstein distance between the sample water quality data and the target water quality data;

carrying out data discrimination on the target water quality data through a preset discriminator to obtain discrimination information; the discrimination information is used for representing the similarity degree between the target water quality data and the sample water quality data;

performing model parameter adjustment on the preset generator according to the word shift distance, the Wasserstein distance and the discrimination information to obtain a preliminary generator;

calculating generative adversarial loss data;

and adjusting model parameters of the preliminary generator and the discriminator according to the generative adversarial loss data to obtain the preset noise generation network.

In some embodiments, the obtaining a guide vector includes:

acquiring the trained times of the preset noise generation network;

if the trained times are zero, randomly generating the guide vector;

and if the trained times are greater than zero, acquiring the target water quality data generated by the preset generator in the last training, and extracting the characteristics of the sample water quality data and the target water quality data to obtain the guide vector.
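The branch on the trained times can be sketched as follows. This is a minimal illustration only: the function names, the Gaussian initialization and the element-wise averaging used as "feature extraction" are assumptions made for the example and are not specified by the application.

```python
import random

def combine_features(sample, last_target):
    """Hypothetical feature extraction: element-wise mean of the two sequences."""
    return [(s + t) / 2.0 for s, t in zip(sample, last_target)]

def get_guide_vector(trained_times, dim, sample=None, last_target=None, rng=None):
    """Return a guide vector depending on how often the network was trained."""
    rng = rng or random.Random()
    if trained_times == 0:
        # First training round: no previous output exists, so draw at random.
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Later rounds: derive the vector from the sample data and the
    # target data produced by the generator in the previous round.
    return combine_features(sample, last_target)
```

In practice the feature extraction would be a learned module; the averaging above only marks where it plugs in.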

In some embodiments, the calculating generative adversarial loss data includes:

carrying out data discrimination on the sample water quality data through the preset discriminator to obtain a discrimination loss; carrying out loss calculation according to the discrimination information and the discrimination loss to obtain the generative adversarial loss data;

or,

carrying out data discrimination on the sample water quality data through the preset discriminator to obtain a discrimination loss; generating data according to the sample water quality data and the target water quality data to obtain reference water quality data; carrying out data discrimination on the reference water quality data through the preset discriminator to obtain gradient penalty data; and carrying out loss calculation according to the discrimination information, the discrimination loss and the gradient penalty data to obtain the generative adversarial loss data.
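The second alternative resembles the gradient-penalty form of the Wasserstein GAN loss. Under that assumption, one common way to write it is shown below; the symbols $\epsilon$, $\lambda$ and $\hat{x}$ are introduced here for illustration and do not appear in the application. Here $x$ is the sample water quality data, $G(z)$ the target water quality data, and $\hat{x}$ the reference water quality data formed from both.

```latex
\hat{x} = \epsilon\, x + (1 - \epsilon)\, G(z), \qquad \epsilon \sim U[0, 1]

L = \underbrace{\mathbb{E}_{z}\big[D(G(z))\big]}_{\text{discrimination information}}
  - \underbrace{\mathbb{E}_{x}\big[D(x)\big]}_{\text{discrimination loss}}
  + \lambda\, \underbrace{\mathbb{E}_{\hat{x}}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big]}_{\text{gradient penalty data}}
```

Dropping the last term gives a loss of the shape described in the first alternative.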

In some embodiments, the encrypting the initial water quality data and the pseudo data to obtain encrypted water quality data includes:

acquiring a first prime number and a second prime number;

determining a key length according to the first prime number and the second prime number;

obtaining a target integer; the target integer is greater than one and smaller than a preset integer threshold, and the preset integer threshold is obtained according to the first prime number and the second prime number; the target integer and the preset integer threshold are coprime;

generating a first public key according to the key length and the target integer;

encrypting the initial water quality data and the pseudo data through the first public key to obtain the encrypted water quality data;

decrypting the encrypted water quality data according to the transaction request to obtain decrypted water quality data, including:

generating a first private key according to the target integer, the preset integer threshold and the key length;

and decrypting the encrypted water quality data through the first private key according to the transaction request to obtain the decrypted water quality data.
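The key generation steps above follow the shape of textbook RSA. The sketch below is written under that assumption: the "key length" is read as the modulus n = p·q and the "preset integer threshold" as φ(n) = (p − 1)(q − 1). Both readings, and all names in the code, are interpretations rather than statements from the application. The tiny primes are for illustration only and offer no security.

```python
from math import gcd

def generate_keys(p, q, e):
    """Textbook RSA key generation from two primes and a target integer e."""
    n = p * q                  # modulus (the "key length", by assumption)
    phi = (p - 1) * (q - 1)    # the "preset integer threshold", by assumption
    if not (1 < e < phi) or gcd(e, phi) != 1:
        raise ValueError("e must satisfy 1 < e < phi and be coprime to phi")
    d = pow(e, -1, phi)        # modular inverse of e (Python 3.8+)
    return (n, e), (n, d)      # first public key, first private key

public_key, private_key = generate_keys(61, 53, 17)
cipher = pow(65, public_key[1], public_key[0])       # encrypt the integer 65
plain = pow(cipher, private_key[1], private_key[0])  # decrypt it again
```

A real deployment would use a vetted cryptographic library with padding and primes of at least 1024 bits each.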

In some embodiments, the encrypting the initial water quality data and the pseudo data by the first public key to obtain the encrypted water quality data includes:

encoding the initial water quality data and the pseudo data to obtain an encoded water quality data stream;

grouping the coded water quality data stream according to the key length to obtain a plurality of water quality grouping data;

performing data conversion on each water quality grouping data according to the key length to obtain candidate water quality data;

and encrypting the candidate water quality data through the first public key to obtain the encrypted water quality data.
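The encode, group, convert and encrypt steps can be sketched as block-wise textbook RSA. UTF-8 as the encoding, big-endian integers as the data conversion, and storing each block's byte length next to its ciphertext are assumptions of this sketch; the function names are hypothetical.

```python
def encrypt_text(text, n, e):
    """Encode text, split it into blocks smaller than n, and encrypt each block."""
    data = text.encode("utf-8")                    # encoding step
    block_size = (n.bit_length() - 1) // 8         # bytes per block, so value < n
    if block_size < 1:
        raise ValueError("modulus too small to hold one byte per block")
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # Data conversion: bytes -> integer; the length is kept to restore leading zeros.
    return [(len(b), pow(int.from_bytes(b, "big"), e, n)) for b in blocks]

def decrypt_text(cipher_blocks, n, d):
    """Decrypt each block, convert it back to bytes, and decode the stream."""
    data = b"".join(pow(c, d, n).to_bytes(length, "big")
                    for length, c in cipher_blocks)
    return data.decode("utf-8")
```

With the toy key n = 3233, e = 17, d = 2753 each block holds one byte; larger moduli give larger blocks.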

In some embodiments, the signing the encrypted water quality data, determining a target signature, comprises:

obtaining public parameters and a second private key;

performing signature calculation according to the second private key, the encrypted water quality data and the public parameters to obtain a primary signature;

performing signature aggregation on all the preliminary signatures to obtain the target signature;

the public parameters include a verification key, and the verifying the target signature includes:

performing signature decomposition on the target signature to obtain a plurality of preliminary signatures;

and verifying each preliminary signature according to the verification key.
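The sign, aggregate, decompose and verify flow can be illustrated with a toy hash-then-sign scheme. The sketch assumes textbook RSA over a SHA-256 digest as the signature and a simple tuple as the "aggregation"; the application does not name a scheme, and real aggregate signatures (for example pairing-based ones) compress the signatures into a single value instead. All function names are hypothetical.

```python
import hashlib

def digest_int(message, n):
    """Map a message to an integer below n via SHA-256."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message, n, d):
    """Preliminary signature of one encrypted block with the second private key."""
    return pow(digest_int(message, n), d, n)

def aggregate(signatures):
    """Bundle all preliminary signatures into one target signature."""
    return tuple(signatures)

def verify_aggregate(messages, target_signature, n, e):
    """Decompose the target signature and verify every preliminary signature."""
    preliminary = list(target_signature)           # signature decomposition
    if len(preliminary) != len(messages):
        return False
    return all(pow(s, e, n) == digest_int(m, n)
               for m, s in zip(messages, preliminary))
```

The toy modulus used in testing is far too small for real use, since many messages share the same reduced digest.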

To achieve the above object, a second aspect of the embodiments of the present application proposes a water quality data processing apparatus, the apparatus comprising:

The acquisition module is used for acquiring initial water quality data; the initial water quality data includes sensitive data;

the noise generation module is used for generating noise on the sensitive data through a preset noise generation network to obtain pseudo data;

the encryption module is used for encrypting the initial water quality data and the pseudo data to obtain encrypted water quality data;

the signature module is used for signing the encrypted water quality data and determining a target signature;

the signature verification module is used for verifying the target signature and storing the encrypted water quality data that passes verification to a blockchain;

the decryption module is used for responding to a transaction request of the blockchain and decrypting the encrypted water quality data according to the transaction request to obtain decrypted water quality data; the decrypted water quality data includes a data identification tag; the data identification tag is used to distinguish between the sensitive data and the pseudo data.

To achieve the above object, a third aspect of the embodiments of the present application provides an electronic device, where the electronic device includes a memory and a processor, the memory stores a computer program, and the processor executes the computer program to implement the water quality data processing method of the first aspect.

To achieve the above object, a fourth aspect of the embodiments of the present application proposes a computer-readable storage medium storing a computer program, which when executed by a processor, implements the water quality data processing method of the first aspect.

According to the water quality data processing method, the water quality data processing device, the electronic equipment and the computer-readable storage medium, initial water quality data that includes sensitive data is obtained, and noise generation is performed on the sensitive data through the preset noise generation network to obtain pseudo data. The pseudo data blurs the sensitive data so that an attacker cannot distinguish real data from pseudo data, which effectively resists attacks and ensures the security of the sensitive data. To further enhance the security of the water quality data, both the initial water quality data and the pseudo data are encrypted to obtain encrypted water quality data, rather than encrypting only one of them, so that the pseudo data and the sensitive data are blurred together and the attacker's ability to identify the sensitive data is reduced. To ensure the integrity of the encrypted water quality data and prevent malicious modification, the encrypted water quality data is signed to determine a target signature, the target signature is verified, and the encrypted water quality data that passes verification is stored in a blockchain, which ensures the accuracy and authenticity of the on-chain data while protecting data privacy. In response to a transaction request of the blockchain, the encrypted water quality data is decrypted according to the transaction request, which avoids on-chain attacks and allows only objects that meet the requirements to obtain the decrypted water quality data and view the sensitive data, thereby ensuring the security of the water quality data and realizing its privacy protection.

Drawings

FIG. 1 is a flow chart of a water quality data processing method provided in an embodiment of the present application;

FIG. 2 is a flowchart of a training process of a preset noise generation network provided in an embodiment of the present application;

FIG. 3 is a flowchart of step S220 in FIG. 2;

FIG. 4 is a flowchart of step S280 in FIG. 2;

FIG. 5 is a flowchart of step S130 and step S160 in FIG. 1;

FIG. 6 is a flowchart of step S550 in FIG. 5;

FIG. 7 is a flowchart of step S140 and step S150 in FIG. 1;

FIG. 8 is a schematic diagram of a water quality data processing apparatus according to an embodiment of the present disclosure;

FIG. 9 is a schematic hardware structure of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.

Based on the above, the embodiment of the application provides a water quality data processing method, a water quality data processing device, electronic equipment and a computer readable storage medium, which aim to ensure the data security of water quality data and realize the privacy protection of the water quality data.

The water quality data processing method, the water quality data processing device, the electronic equipment and the computer readable storage medium provided in the embodiments of the present application are specifically described by the following embodiments, and the water quality data processing method in the embodiments of the present application is first described.

The embodiment of the application provides a water quality data processing method, which relates to the technical field of water conservancy monitoring. The water quality data processing method provided by the embodiment of the application can be applied to a terminal, to a server side, or to software running in the terminal or the server side. In some embodiments, the terminal may be a smart phone, tablet, notebook, desktop, etc.; the server side can be configured as an independent physical server, as a server cluster or distributed system formed by a plurality of physical servers, or as a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and big data and artificial intelligence platforms; the software may be an application or the like for realizing the water quality data processing method, but is not limited to the above forms.

The subject application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

Fig. 1 is an optional flowchart of a water quality data processing method provided in an embodiment of the present application, where the method in fig. 1 may include, but is not limited to, steps S110 to S160:

step S110, obtaining initial water quality data; the initial water quality data includes sensitive data;

step S120, noise generation is carried out on sensitive data through a preset noise generation network, and pseudo data are obtained;

step S130, encrypting the initial water quality data and the pseudo data to obtain encrypted water quality data;

step S140, signing the encrypted water quality data to determine a target signature;

step S150, verifying the target signature, and storing the encrypted water quality data that passes verification to a blockchain;

step S160, in response to a transaction request of the blockchain, decrypting the encrypted water quality data according to the transaction request to obtain decrypted water quality data; the decrypted water quality data includes a data identification tag; the data identification tag is used to distinguish between sensitive data and pseudo data.

In step S110 of some embodiments, sampling points are selected, and various water quality parameters in the water treatment process are collected by sensors installed in the water treatment apparatus to obtain the initial water quality data. The initial water quality data includes the water quality index, water flow, pH value, chemical oxygen demand (COD) value and the like, where the water quality index represents the number of types of impurities in the water and is a specific measurement scale for judging the degree of water pollution. The sensitive data is the target water quality text that needs privacy protection and can be defined according to actual conditions. The sensitive data may be key personal information of an object, such as position information, time information and the ammonia nitrogen content of sewage; most of this information is numerical data.

To meet the requirements of data analysis, the initial water quality data needs to be cleaned to remove abnormal values and noise, and is converted through regularization into a format that is convenient to process.
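A minimal sketch of such a cleaning step is given below. The application does not state which outlier rule or scaling it uses, so the z-score filter, the threshold value and the min-max scaling to [0, 1] are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def clean_and_scale(values, z_threshold=3.0):
    """Drop readings far from the mean, then rescale the rest to [0, 1]."""
    mu, sigma = mean(values), pstdev(values)
    # Keep a reading if the data has no spread or its z-score is within bounds.
    kept = [v for v in values
            if sigma == 0 or abs(v - mu) / sigma <= z_threshold]
    lo, hi = min(kept), max(kept)
    if hi == lo:
        return [0.0 for _ in kept]
    return [(v - lo) / (hi - lo) for v in kept]

# Example: pH readings with one faulty sensor value.
scaled = clean_and_scale([6.8, 7.0, 7.2, 7.0, 7.1, 6.9, 70.0], z_threshold=2.0)
```

For short series a single extreme value inflates the standard deviation, so a lower threshold or a median-based rule may be more suitable.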

In step S120 of some embodiments, a differential privacy technique is used to hide the sensitive data and protect its privacy. The preset noise generation network is trained on real sample water quality data. To prevent sensitive data from being obtained by inferring the training data from the network output, the embodiment of the application reasons about the possible types of privacy attack, such as membership inference attacks and model inversion attacks, and designs a machine-learning-based preset noise generation network for protecting the water quality parameters hidden in the personal information of the object. For differential privacy, the greater the noise added to the data, the higher the privacy protection level but the lower the data utility. A generative adversarial network helps find a balance point in noise generation, producing noise that protects privacy while preserving as much data utility as possible. The preset noise generation network of the embodiment of the application comprises a generator, a discriminator, a personality module and a semantic module. The generator is used for learning the distribution of the sensitive data and generating noise samples (pseudo data) similar to the real data. Noise generation is performed on the sensitive data by the generator to obtain the pseudo data, which are noise samples generated by the generator that have a high similarity to the sensitive data. The pseudo data is used to confuse an attacker, making it difficult to determine which data are sensitive data and which are noise data. The discriminator, the personality module and the semantic module are all used for distinguishing real data from pseudo data. Through the interaction among the generator, the discriminator, the personality module and the semantic module, noise generation is continuously optimized, so that the generated noise interferes with privacy attacks and is general enough to cope with unknown or future attack methods.

Referring to fig. 2, in some embodiments, the training process of the preset noise generation network may include, but is not limited to, steps S210 to S290:

step S210, acquiring sample water quality data and randomly generating a noise sequence;

step S220, obtaining a guide vector;

step S230, carrying out data generation on sample water quality data, guide vectors and noise sequences through a preset generator to obtain pseudo water quality data;

step S240, performing Monte Carlo sampling on the pseudo water quality data to obtain target water quality data;

step S250, calculating the word shift distance and the Wasserstein distance between the sample water quality data and the target water quality data;

step S260, carrying out data discrimination on the target water quality data through a preset discriminator to obtain discrimination information; the discrimination information is used for representing the similarity degree between the target water quality data and the sample water quality data;

step S270, carrying out model parameter adjustment on the preset generator according to the word shift distance, the Wasserstein distance and the discrimination information to obtain a preliminary generator;

step S280, calculating generative adversarial loss data;

and step S290, adjusting model parameters of the preliminary generator and the discriminator according to the generative adversarial loss data to obtain the preset noise generation network.

In step S210 of some embodiments, the sample water quality data is real water quality data, which can be obtained by a sensor in the water treatment apparatus. The noise sequence is one of the key factors in generating diverse noise samples; it is randomly sampled from a predefined probability distribution, which may be a Gaussian distribution. The noise sequence is an incomplete (partial) sequence.

In step S220 of some embodiments, when the generator is optimized using only scalar values such as the noise sequence, the guidance signal is weak; without an intermediate module to guide the generator to learn sufficiently, the generator cannot optimize its network parameters effectively. Therefore, a semantic guidance module is added to the generator, and the guide vector is obtained through the semantic guidance module to provide richer data classification, which guides the generation of pseudo data and reduces the consumption of computing resources.

In step S230 of some embodiments, water quality feature extraction is performed on the sample water quality data to obtain sample water quality features; noise feature extraction is performed on the noise sequence to obtain noise features; and guide feature extraction is performed on the guide vector to obtain guide features. Data generation is then performed on the sample water quality features, the noise features and the guide features through the preset generator to obtain the pseudo water quality data. The pseudo water quality data is pseudo data, generated by the preset generator, that is similar to the sample water quality data.

The water quality feature extraction process comprises the following steps: performing text vectorization on the sample water quality data to obtain a sample water quality embedding vector; allocating a position coding vector to each word in the sample water quality data, wherein the position coding vector can be obtained through relative position coding or absolute position coding; performing vector addition on the sample water quality embedding vector and the position coding vector to obtain a fused water quality vector; performing self-attention coding on the fused water quality vector through a self-attention layer to extract semantic features of the sample water quality data and obtain a first attention vector; performing feedforward feature extraction on the first attention vector through a feedforward neural network layer to obtain a second attention vector; normalizing the second attention vector through a normalization layer to obtain a normalized vector, wherein the normalization adopts layer normalization; and performing attention coding on the normalized vector through an attention layer to obtain the sample water quality features.
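The following sketch traces that chain on already-embedded inputs using only the standard library. It makes several simplifying assumptions that are not in the application: sinusoidal absolute position codes, single-head attention with identity projections (no learned weights), and omission of the feedforward layer. The function names are hypothetical.

```python
import math

def positional_encoding(length, dim):
    """Absolute sinusoidal position codes, one vector per position."""
    table = []
    for pos in range(length):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def softmax(scores):
    peak = max(scores)                      # subtract the max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections."""
    dim = len(vectors[0])
    output = []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        output.append([sum(w * v[c] for w, v in zip(weights, vectors))
                       for c in range(dim)])
    return output

def layer_norm(vector, eps=1e-5):
    mu = sum(vector) / len(vector)
    var = sum((a - mu) ** 2 for a in vector) / len(vector)
    return [(a - mu) / math.sqrt(var + eps) for a in vector]

def extract_features(embeddings):
    """Embedding + position code -> attention -> layer norm -> attention."""
    pos = positional_encoding(len(embeddings), len(embeddings[0]))
    fused = [[e + p for e, p in zip(emb, pe)]
             for emb, pe in zip(embeddings, pos)]
    first_attention = self_attention(fused)
    normalized = [layer_norm(v) for v in first_attention]
    return self_attention(normalized)

features = extract_features([[0.1, 0.2, 0.3, 0.4],
                             [0.5, 0.1, 0.0, 0.2],
                             [0.3, 0.3, 0.3, 0.3]])
```

A trained model would add learned query, key and value projections and the feedforward layer between the first attention step and the normalization.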

The preset generator includes a long short-term memory network (Long Short-Term Memory, LSTM) and a softmax layer. The noise feature extraction process is as follows: the noise sequence is vectorized to obtain a noise vector, and feature extraction is performed on the noise vector through the long short-term memory network to obtain the noise features. The noise features represent the intermediate state in the reinforcement learning process of the network, that is, the state of the partial data vector during pseudo data generation. For example, a word string that is still being generated is one such incomplete intermediate state.

The semantic guidance module comprises a long short-term memory network and a linear transformation layer, and is used for distinguishing water quality information under different conditions, such as time information and pollutant content. Like the preset generator, the semantic guidance module uses a long short-term memory network, so that the feature structures of the guide features and the noise features remain consistent. The guide feature extraction process is as follows: feature extraction is performed on the guide vector through the long short-term memory network to obtain a semantic guide vector, and the semantic guide vector is linearly transformed through the linear transformation layer to obtain the guide features.

Feature addition is performed on the sample water quality features, the noise features and the guide features, and the added features are activated through the softmax layer to obtain the pseudo water quality data.

In step S240 of some embodiments, the discriminator can only identify a complete data sequence, so repeated sampling by means of Monte Carlo sampling is required to obtain a complete data sequence, while the random number generation method increases the diversity of the generation process. Monte Carlo sampling is performed on the pseudo water quality data to obtain the complete target water quality data. The specific steps are as follows:

1. Generate the next data w from the currently generated incomplete sequence according to the policy model probability p.

2. Splice w onto the current incomplete sequence to obtain a new incomplete sequence.

3. Repeat steps 1 and 2 k times (assuming k is 4) to obtain k incomplete sequences.

4. Take the discrimination probabilities of the k sequences, or their similarities to the sample water quality data, as the feedback for the current data w.

5. Update the policy model according to the feedback, and generate the next data until sequence generation is finished.
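One way to read these steps is as a rollout in the style of sequence GANs: after choosing w, the sequence is completed k times and the k scores are combined into one feedback value. The sketch below follows that reading. Completing every rollout to full length and averaging the scores are assumptions, and `policy` and `score` are hypothetical stand-ins for the policy model and the discriminator.

```python
import random

def rollout_feedback(partial, policy, score, seq_len, k=4, rng=None):
    """Sample the next item w and estimate its feedback from k rollouts."""
    rng = rng or random.Random()
    tokens, probs = policy(partial)
    w = rng.choices(tokens, weights=probs, k=1)[0]   # step 1: sample w
    prefix = partial + [w]                           # step 2: splice w
    scores = []
    for _ in range(k):                               # step 3: k rollouts
        seq = list(prefix)
        while len(seq) < seq_len:                    # complete the sequence
            toks, pr = policy(seq)
            seq.append(rng.choices(toks, weights=pr, k=1)[0])
        scores.append(score(seq))                    # step 4: score each one
    return w, sum(scores) / k                        # averaged feedback for w

# Hypothetical policy and scorer for illustration.
uniform_policy = lambda seq: (["a", "b"], [0.5, 0.5])
share_of_a = lambda seq: seq.count("a") / len(seq)
w, feedback = rollout_feedback(["a"], uniform_policy, share_of_a, seq_len=5)
```

Step 5, the policy update, would consume `feedback` as the reward for having chosen `w`.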

In step S250 of some embodiments, the sequence completed by Monte Carlo sampling, namely the target water quality data, is also input to the personality module and the semantic module respectively, to calculate the personality distance and the semantic distance between the real sentence and the generated sentence. To reduce computational complexity while ensuring the accuracy of the distance calculation, both distances are considered together: the personality distance of the personality module uses the Wasserstein distance as the reference index for measuring the deviation between the generated data and the real data, and the semantic distance of the semantic module uses the data offset, namely the word shift distance (Word Mover's Distance, WMD), as the reference index for measuring that deviation. Together they compare the correlation between the distribution of the data generated by the generator and that of the real data. The semantic distance is calculated as shown in formula (1).

Formula (1)

Wherein the semantic distance represents the connection weight between the generated data (target water quality data) and the real data (sample water quality data); wmd is the data offset; data_length is the data sequence length, which may be the sequence length of the generated data or of the real data, since the generated data and the real data have the same length; df represents the generated data, and dr represents the real data.

The overall idea of WMD is to calculate the cost of converting text X into text Y, i.e. the computational complexity or data offset. The calculation method of the data offset is shown in formula (2).

Formula (2)

Wherein X and Y represent two texts; the weight term represents the connection weight between a word of X and a word of Y; the distance term represents the word vector distance between those two words, which can be defined by the Euclidean distance; len(X) represents the length of text X.
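For reference, the word mover's distance is commonly written as the optimal transport problem below. The symbols $T_{ij}$, $c(i, j)$, $d_i$ and $d'_j$ are chosen here for illustration and may differ from those used in formula (2); $d_i$ and $d'_j$ denote the normalized word weights of X and Y, and $x_i$, $y_j$ the word vectors.

```latex
\mathrm{wmd}(X, Y) = \min_{T \ge 0} \sum_{i=1}^{\mathrm{len}(X)} \sum_{j=1}^{\mathrm{len}(Y)} T_{ij}\, c(i, j),
\qquad c(i, j) = \lVert x_i - y_j \rVert_2

\text{subject to} \quad \sum_{j=1}^{\mathrm{len}(Y)} T_{ij} = d_i, \qquad \sum_{i=1}^{\mathrm{len}(X)} T_{ij} = d'_j
```

The constraints ensure that all the weight of each word in X is moved onto words of Y, so the minimum is the cheapest way to convert X into Y.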

The personality distance is measured by Wasserstein distance, and the specific calculation method of the personality distance is shown in a formula (3).

Formula (3)

Wherein D represents the discriminator; x represents the sample water quality data, which follows the real data distribution; z is the noise sequence, which follows the noise distribution; G represents the generator, whose output for z is the target water quality data; E represents the expectation.
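These definitions match the estimate of the Wasserstein distance used in Wasserstein GANs, which is commonly written as follows. The distribution symbols $P_r$ (real data) and $P_z$ (noise) are assumed names, and the exact form and sign convention of formula (3) may differ.

```latex
W \approx \max_{\lVert D \rVert_L \le 1} \; \mathbb{E}_{x \sim P_r}\big[D(x)\big] - \mathbb{E}_{z \sim P_z}\big[D(G(z))\big]
```

The maximum is taken over 1-Lipschitz discriminators; a smaller value indicates that the generated data is closer to the real data distribution.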

In step S260 of some embodiments, the preset generator takes a random number (the noise sequence) as the starting point of the water quality data generation task and uses the LSTM to generate each item of the pseudo water quality data sequence step by step. The semantic features extracted from the real sequence by the self-attention mechanism improve the semantic similarity between the data generated by the generator and the real data. In the generative adversarial framework, the authenticity of the generated sequence is improved with the help of the discriminator, which is trained using negative samples generated by the generator and positive samples from the training dataset; this is essentially a binary classification task. The data generated by the preset generator and completed by Monte Carlo sampling (the target water quality data) and the real data (the sample water quality data) are input to the preset discriminator for data discrimination, which judges the authenticity of the generated sequence and outputs the probability that the generated sequence is real data.

The preset discriminant employs a convolutional neural network (Convolutional Neural Networks, CNN) through which contextual features of the text sequence are extracted. And carrying out vectorization on the target water quality data through the data embedding layer to obtain a data vector, inputting the data vector into the convolution layer, and extracting text features by using convolution kernels with different sizes to obtain the text features. Text features are transferred to the output layer after full connection layer processing, and the output layer adopts a softmax function. And processing the output of the output layer through the linear layer and the sigmoid activation layer to obtain the discrimination information. The discrimination information is used for representing the similarity degree between the target water quality data and the sample water quality data, namely the probability that the target water quality data output by the discriminator is real data.

In step S270 of some embodiments, the generator is adjusted according to three kinds of feedback: the personality distance, the semantic distance, and the discrimination information. The personality distance is the distance between the generated data and the real data measured with the Wasserstein distance as the reference index. A lower personality distance output by the personality module indicates that the generated data captures the distribution characteristics of the real data; the smaller the personality distance, the larger the reward signal. The discrimination information is the probability, output by the discriminator, that the target water quality data is real; the larger the probability, the larger the reward signal. The semantic distance is the distance between the generated data and the real data measured with the data offset as the reference index; the larger the semantic distance output by the semantic module, the larger the reward signal. The weighted average of the personality distance, the semantic distance, and the discrimination information is fed back to the generator as the reinforcement learning reward signal to guide the learning of the generator, and Monte Carlo search is used to propagate the reward back to the intermediate state-action steps until a complete sequence is generated. The generator uses a policy gradient to maximize the reward signal that is fed back. That is, the weighted average of the personality distance, the semantic distance, and the discrimination information is taken as the reward data, and the reward data is maximized to adjust the model parameters of the preset generator, which improves the accuracy of the data produced by the generator and yields the preliminary generator.
If the personality distance is denoted as a, the weight of the personality distance as w1, the semantic distance as b, the weight of the semantic distance as w2, the discrimination information as c, and the weight of the discrimination information as w3, the weighted average is expressed as 1/3 × (a×w1 + b×w2 + c×w3), where × denotes multiplication.
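The weighted average can be written down directly. The following Python sketch only restates the expression 1/3 × (a×w1 + b×w2 + c×w3); the function name and the example values are hypothetical, and how each feedback term is scaled or sign-adjusted before it enters the average is not specified in the text, so that is left to the caller here.

```python
def reward_signal(a, b, c, w1, w2, w3):
    """Reward data as stated in the text: 1/3 * (a*w1 + b*w2 + c*w3).

    a: personality distance, b: semantic distance, c: discrimination
    information; w1, w2, w3 are their respective weights.
    """
    return (a * w1 + b * w2 + c * w3) / 3.0


# Hypothetical feedback values and weights, for illustration only.
example = reward_signal(a=0.5, b=0.25, c=1.0, w1=1.0, w2=2.0, w3=1.0)
```

With these inputs the three weighted terms are 0.5, 0.5 and 1.0, so the reward is 2/3.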

In step S280 of some embodiments, to further improve the generation capability of the generator and the discrimination capability of the discriminator, the generative adversarial loss data is calculated.

In step S290 of some embodiments, the generative adversarial loss data is minimized and the model parameters of the preliminary generator and the discriminator are adjusted. The generation capability of the generator and the discrimination capability of the discriminator improve continuously during the minimax game until a Nash equilibrium is finally reached, yielding the preset noise generation network.

In the above steps S210 to S290, the model parameters of the generator are first adjusted using the reward data as feedback and then adjusted again using the generative adversarial loss data; the two adjustments determine a better generator and thereby improve the authenticity of the generated data. The model parameters of the discriminator are adjusted using the generative adversarial loss data, so that the generation capability of the generator and the discrimination capability of the discriminator improve continuously during the game, producing noise that protects privacy while preserving the usefulness of the data.

Referring to fig. 3, in some embodiments, step S220 may include, but is not limited to, steps S310 to S330:

Step S310, obtaining the trained times of a preset noise generation network;

step S320, if the trained times are zero, randomly generating a guide vector;

and step S330, if the trained times are greater than zero, acquiring the target water quality data generated by the preset generator in the previous training, and performing feature extraction on the sample water quality data and the target water quality data to obtain the guide vector.

In step S310 of some embodiments, the trained times is the number of times that the preset noise generation network has already been trained. The trained times is a non-negative integer.

In step S320 of some embodiments, if the trained times is zero, the preset noise generation network is about to be trained for the first time, and the guide vector is generated randomly.

In step S330 of some embodiments, if the trained times is greater than zero, the target water quality data generated by the preset generator in the training round preceding the current one is obtained, and the CNN of the discriminator is used to extract high-level text features from that target water quality data and the sample water quality data to obtain the guide vector. The guide vector is processed by the feature guide module and passed to the generator to enhance the text generation process.

Through the above steps S310 to S330, a guide vector can be obtained that carries richer classification information from the discriminator, so that the data generation process is guided by the guide vector and the authenticity of the generated data is improved.
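The branch in steps S310 to S330 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `get_guide_vector`, `mean_features`, the vector dimension and the Gaussian initialisation are all hypothetical, and `mean_features` is only a stand-in for the feature extraction that the text assigns to the discriminator's CNN.

```python
import random


def mean_features(sample_data, target_data):
    """Hypothetical stand-in for the CNN feature extractor: element-wise mean."""
    return [(s + t) / 2.0 for s, t in zip(sample_data, target_data)]


def get_guide_vector(trained_times, sample_data, last_target_data,
                     extract_features=mean_features, dim=8, rng=None):
    """Return a guide vector following steps S310 to S330."""
    if trained_times < 0:
        raise ValueError("trained_times must be a non-negative integer")
    if trained_times == 0:
        # S320: first training round, so the guide vector is random.
        rng = rng or random.Random()
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # S330: later rounds reuse the previous round's generated data.
    return extract_features(sample_data, last_target_data)


first_round = get_guide_vector(0, [], None, dim=4, rng=random.Random(0))
later_round = get_guide_vector(2, [1.0, 3.0], [3.0, 5.0])
```

Passing the extractor as a parameter keeps the control flow independent of the particular network that produces the features.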

Referring to fig. 4, in some embodiments, step S280 may include, but is not limited to, step S410 or step S420:

Step S410, performing data discrimination on the sample water quality data through the preset discriminator to obtain the discrimination loss; performing loss calculation according to the discrimination information and the discrimination loss to obtain the generative adversarial loss data;

Step S420, performing data discrimination on the sample water quality data through the preset discriminator to obtain the discrimination loss; performing data generation according to the sample water quality data and the target water quality data to obtain reference water quality data; performing data discrimination on the reference water quality data through the preset discriminator to obtain gradient penalty data; and performing loss calculation according to the discrimination information, the discrimination loss, and the gradient penalty data to obtain the generative adversarial loss data.

In step S410 of some embodiments, the preset discriminator D performs data discrimination on the sample water quality data x to obtain the discrimination loss D(x). The discrimination information is expressed as D(G(z)), and loss calculation is performed according to the discrimination information D(G(z)) and the discrimination loss D(x) to obtain the generative adversarial loss data. The calculation method of the generative adversarial loss data is shown in formula (4).

Formula (4)

In step S420 of some embodiments, the original generative adversarial network learns too freely, which makes the training process and its results hard to control. To address this, the Wasserstein distance is used instead of the JS divergence to measure the difference between the generated distribution and the real distribution more accurately, and a gradient penalty is applied at the same time, which further improves training stability and the optimization effect. The preset discriminator D performs data discrimination on the sample water quality data x to obtain the discrimination loss D(x). Based on a random number $\epsilon$ uniformly distributed on [0, 1], data generation is performed according to the sample water quality data x and the target water quality data G(z) to obtain the reference water quality data $\hat{x} = \epsilon x + (1 - \epsilon) G(z)$. The preset discriminator performs data discrimination on the reference water quality data $\hat{x}$ to obtain the gradient penalty data $(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1)^2$, where $\nabla_{\hat{x}} D(\hat{x})$ represents the gradient of $D(\hat{x})$ with respect to $\hat{x}$. Using the gradient penalty data as a regularization term of the loss function accelerates the convergence of the model, a clear improvement over a traditional generative adversarial network. Loss calculation is performed according to the discrimination information to obtain the generation loss; loss calculation is performed according to the discrimination information, the discrimination loss, and the gradient penalty data to obtain the discriminator loss; the generation loss and the discriminator loss together serve as the generative adversarial loss data. The model parameters of the generator are adjusted using the generation loss, and the model parameters of the discriminator are adjusted using the discriminator loss. The loss function of the generator (generation loss) is shown in formula (5), and the loss function of the discriminator (discriminator loss) is shown in formula (6).

Formula (5)

Formula (6)

Wherein $\lambda$ is the weight parameter of the gradient penalty term, w represents the model parameters of the discriminator, and $\theta$ represents the model parameters of the generator.
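To make the roles of the discrimination information, the discrimination loss and the gradient penalty concrete, the Python sketch below evaluates losses in the standard WGAN-GP form. This is an assumption: formulas (5) and (6) are not reproduced in this text, so the exact expressions used by the application may differ. The critic is a hypothetical linear function D(x) = w · x, chosen because its gradient with respect to the input is simply w, which lets the penalty be computed without an automatic differentiation library. The penalty weight of 10 and all names are illustrative.

```python
import math


def critic(w, x):
    """Linear critic D(x) = w . x, a stand-in for the CNN discriminator."""
    return sum(wi * xi for wi, xi in zip(w, x))


def interpolate(x_real, x_fake, eps):
    """Reference data: eps * x + (1 - eps) * G(z), with eps in [0, 1]."""
    return [eps * r + (1.0 - eps) * f for r, f in zip(x_real, x_fake)]


def gradient_penalty(w):
    """(||grad D||_2 - 1)^2; for a linear critic the gradient is w everywhere,
    so the value does not depend on which reference point is used."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (norm - 1.0) ** 2


def wgan_gp_losses(w, real_batch, fake_batch, lam=10.0):
    """Return (generation loss, discriminator loss) in WGAN-GP form."""
    d_real = sum(critic(w, x) for x in real_batch) / len(real_batch)
    d_fake = sum(critic(w, x) for x in fake_batch) / len(fake_batch)
    generation_loss = -d_fake
    discriminator_loss = d_fake - d_real + lam * gradient_penalty(w)
    return generation_loss, discriminator_loss


g_loss, d_loss = wgan_gp_losses([3.0, 4.0], [[1.0, 0.0]], [[0.0, 1.0]])
midpoint = interpolate([2.0], [0.0], 0.5)
```

Here the critic scores the real sample 3 and the generated sample 4, the gradient norm is 5, and the penalty is 16, so the two losses are -4 and 161.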

Through the above steps S410 to S420, the generative adversarial loss data can be determined and used to adjust the model parameters, so that a confusing noise signal is generated to protect the sensitive data from information leakage.

In order to further protect sensitive data and ensure the privacy safety of the data, the data formed by packaging the initial water quality data and the pseudo data is encrypted through an encryption algorithm, so that encrypted water quality data is obtained. The encryption algorithm may be an RSA asymmetric encryption algorithm, which generates a public key and a private key, encrypts data using the public key, and decrypts the encrypted data using the private key.

Referring to fig. 5, in some embodiments, step S130 may include, but is not limited to, steps S510 to S550:

step S510, obtaining a first prime number and a second prime number;

step S520, determining the key length according to the first prime number and the second prime number;

step S530, obtaining a target integer; the target integer is larger than one and smaller than a preset integer threshold, and the preset integer threshold is obtained according to the first prime number and the second prime number; the target integer and the preset integer threshold are coprime;

Step S540, a first public key is generated according to the key length and the target integer;

in step S550, the initial water quality data and the dummy data are encrypted by the first public key to obtain encrypted water quality data.

In step S510 of some embodiments, two different prime numbers are selected, resulting in a first prime number and a second prime number.

In step S520 of some embodiments, the first prime number is denoted as p and the second prime number is denoted as q. The first prime number p and the second prime number q are multiplied to obtain n = p × q, which is referred to herein as the key length. Strictly speaking, n is the RSA modulus, and the number of binary bits of n is the key length.

In step S530 of some embodiments, an integer e is selected as the target integer. The target integer is greater than one and less than the preset integer threshold, i.e., $1 < e < \varphi(n)$, where $\varphi(n)$ is the preset integer threshold, $\varphi(n) = (p - 1)(q - 1)$. In addition, the target integer e and $\varphi(n)$ are coprime.

In step S540 of some embodiments, a first public key is generated from the key length and the target integer, the first public key being denoted (n, e).

In step S550 of some embodiments, the initial water quality data and the pseudo data are encrypted with the first public key, converting the plaintext into ciphertext to obtain the encrypted water quality data. The encryption calculation is expressed as $C = m^e \bmod n$, where (n, e) is the first public key, m is the integer converted from the plaintext M of the initial water quality data and the pseudo data, $0 \le m < n$, mod represents the modulo operation, and C is the ciphertext.

Step S160 may include, but is not limited to, steps S560 to S570:

step S560, generating a first private key according to the target integer, the preset integer threshold and the key length;

step S570, according to the transaction request, the encrypted water quality data is decrypted through the first private key, and decrypted water quality data is obtained.

In step S560 of some embodiments, the modular inverse of the target integer is calculated with respect to the preset integer threshold; the modular inverse is the integer d satisfying $e \cdot d \equiv 1 \pmod{\varphi(n)}$. The first private key is generated according to the modular inverse and the key length, and is denoted (n, d). The first public key and the first private key form a key pair.

In step S570 of some embodiments, the transaction request is a data acquisition request of encrypted water quality data, the transaction request carries identity information of a data acquirer, and if the identity information of the data acquirer characterizes that the data acquirer has authority to acquire the data, then the encrypted water quality data is decrypted by using a first private key according to the transaction request, so as to obtain decrypted water quality data. The decrypted water quality data includes a data identification tag that is used to distinguish between sensitive data and dummy data.

It will be appreciated that decryption converts ciphertext into plaintext. The decryption calculation is expressed as $m = c^d \bmod n$, where (n, d) is the first private key, mod represents the modulo operation, and c represents the integer converted from the ciphertext C. The ciphertext C is a non-negative integer between 0 and n-1, so it can be treated directly as the integer c, i.e., c is the integer representation of C. There is no need to reduce c modulo n again, since the reduction was already performed during encryption, which guarantees that c lies in the range 0 to n-1. The integer c can therefore be used directly in the RSA decryption calculation.

Through the steps S510 to S570, the sensitive data can be further protected by encryption, so as to avoid abnormal modification of the sensitive data, and the data acquirer can acquire real, accurate and reliable data.
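The key generation, encryption and decryption steps S510 to S570 can be traced with a toy example in Python. The primes 61 and 53 and the exponent 17 are illustrative values chosen only to keep the numbers readable; real deployments use primes of a thousand bits or more together with a padding scheme, neither of which is shown here. The function names are hypothetical.

```python
from math import gcd


def generate_key_pair(p, q, e):
    """Steps S510 to S540 and S560: build (n, e) and (n, d) from two primes."""
    n = p * q
    phi = (p - 1) * (q - 1)          # preset integer threshold
    if not (1 < e < phi) or gcd(e, phi) != 1:
        raise ValueError("e must satisfy 1 < e < phi and be coprime with phi")
    d = pow(e, -1, phi)              # modular inverse (Python 3.8+)
    return (n, e), (n, d)


def encrypt(m, public_key):
    """Step S550: C = m^e mod n for an integer 0 <= m < n."""
    n, e = public_key
    if not 0 <= m < n:
        raise ValueError("m must be in the range 0..n-1")
    return pow(m, e, n)


def decrypt(c, private_key):
    """Step S570: m = c^d mod n."""
    n, d = private_key
    return pow(c, d, n)


public_key, private_key = generate_key_pair(61, 53, 17)
ciphertext = encrypt(65, public_key)
recovered = decrypt(ciphertext, private_key)
```

With these values n = 3233, the threshold is 3120, d = 2753, the integer 65 encrypts to 2790, and decryption returns 65.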

In the encryption process, the plaintext M needs to be converted into an integer m; the integer m is then encrypted using the first public key to compute the ciphertext. The conversion to the integer m is described in detail below.

Referring to fig. 6, in some embodiments, step S550 may include, but is not limited to, steps S610 to S640:

step S610, coding the initial water quality data and the pseudo data to obtain a coded water quality data stream;

step S620, grouping the coded water quality data stream according to the key length to obtain a plurality of water quality grouping data;

Step S630, carrying out data conversion on each water quality grouping data according to the key length to obtain candidate water quality data;

step S640, encrypting the candidate water quality data through the first public key to obtain encrypted water quality data.

In step S610 of some embodiments, the initial water quality data and the pseudo data are packed into a single piece of data, which is encoded and converted into a binary bit stream to obtain the encoded water quality data stream. Common encodings include ASCII, Unicode, and UTF-8.

In step S620 of some embodiments, the encoded water quality data stream is grouped according to the number of binary bits of the key length to obtain a plurality of water quality grouping data. Each water quality grouping data has the same number of bits as the key length. For example, if the key length is 1024 bits, each water quality grouping data is 128 bytes, i.e., 1024 bits. Each water quality grouping data can then be regarded as a non-negative integer ranging from 0 to $2^{1024} - 1$.

In step S630 of some embodiments, the key length is n, modulo n is performed on each water quality packet data to obtain an integer between 0 and n-1, and candidate water quality data is determined. The candidate water quality data is an integer representation of the initial water quality data and the pseudo data.

In step S640 of some embodiments, the candidate water quality data is encrypted by the first public key based on the encryption calculation method, resulting in encrypted water quality data.

The above steps S610 to S640 facilitate encryption calculation by converting the plaintext into an integer. On the basis of generating the pseudo data, the initial water quality data and the pseudo data are further encrypted, so that double protection of sensitive data is realized.
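A minimal Python sketch of the conversion in steps S610 to S630 is given below, assuming UTF-8 encoding. It deviates from the text in one labeled respect: instead of cutting blocks with exactly as many bits as n and then reducing them modulo n, it uses blocks that are slightly shorter than n, so that every block is already below n. The reason is that reducing a full-size block modulo n cannot be undone when the block value is n or larger. The function name is hypothetical.

```python
def text_to_blocks(text, n):
    """Encode text as UTF-8 and split it into integers that are all below n."""
    data = text.encode("utf-8")
    # Assumption: use the largest whole number of bytes whose value is
    # guaranteed to be smaller than n.
    block_bytes = (n.bit_length() - 1) // 8
    if block_bytes < 1:
        raise ValueError("n is too small to hold even one byte per block")
    return [
        int.from_bytes(data[i:i + block_bytes], "big")
        for i in range(0, len(data), block_bytes)
    ]


# With the toy modulus 3233 (12 bits) each block holds a single byte.
small_blocks = text_to_blocks("pH", 3233)
# With a 34-bit modulus each block holds four bytes.
larger_blocks = text_to_blocks("NH3-N=1", 2 ** 33)
```

Each resulting integer can be passed directly to the encryption calculation, since it already lies between 0 and n-1.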

Traditional water quality detection relies on manual sampling and analysis, which suffers from inaccurate sampling, long analysis cycles, and data that is easy to modify. To solve these problems, water quality detection now generally adopts new technologies such as artificial intelligence and blockchain to monitor the water treatment process in real time and ensure that the water quality reaches the standard. Zero-knowledge proof is an important technical means for protecting data privacy and guaranteeing data authenticity. The present application guarantees the credibility and privacy security of the data based on zero-knowledge proof technology, which is described in detail below.

Referring to fig. 7, in some embodiments, step S140 may include, but is not limited to, steps S710 to S730:

step S710, obtaining public parameters and a second private key;

Step S720, signature calculation is carried out according to the second private key, the encrypted water quality data and the public parameters, and a preliminary signature is obtained;

and step S730, performing signature aggregation on all the preliminary signatures to obtain target signatures.

In step S710 of some embodiments, a plurality of participants P1, P2, ..., Pn jointly generate the public parameters using a secure multiparty computation protocol. The participants may be resident objects, each corresponding to a node within the computer. The public parameters include a verification key and specification parameters. The verification key, typically the public key of a digital signature, is used to verify the correctness of a signature. The specification parameters define the standard format and verification method of the zero-knowledge proof, ensuring that all participants follow the same specification; they mainly include the proof format description, the random number generation algorithm, the hash function, the target group, the proof verification algorithm, and the security parameter. The second private key is used to sign the encrypted water quality data; a private key is generated for each participant to obtain the second private key. The second private key may be generated according to the RSA asymmetric encryption algorithm, which is not described in detail here.

In step S720 of some embodiments, signature calculation is performed according to the second private key sk_i, the encrypted water quality data m_i, and the public parameters to obtain a preliminary signature. The length of the preliminary signature is constant and does not change as the number of signatures increases. The preliminary signature consists of a pair (c, s), where c represents the commitment and s represents the constructed zero-knowledge proof, which covers the encrypted water quality data m_i and the random number r.

Each signer generates a commitment. Assuming that the signer's message is m and the random number is r, the commitment c is calculated as c = com(m; r), where com is a deterministic commitment function that takes the message m and the random number r as input and outputs the commitment c; the random number r is chosen so that each commitment c is different. The signer uses the commitment c to confirm its own signature, which embodies the idea of zero-knowledge proof: the signer only needs to prove knowledge of the message m and the random number r, without revealing the specific data.
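The commitment c = com(m; r) can be illustrated with a hash-based construction in Python. This is an assumption: the text does not say which commitment function is used, and a SHA-256 commitment is only one common choice. The sketch also covers the commitment alone, not the zero-knowledge proof s. The names `commit` and `open_commitment` are hypothetical.

```python
import hashlib
import secrets


def com(message: bytes, r: bytes) -> str:
    """Deterministic commitment: SHA-256 over the length-prefixed r and m."""
    return hashlib.sha256(len(r).to_bytes(4, "big") + r + message).hexdigest()


def commit(message: bytes):
    """Pick a fresh random r so that repeated commitments to m differ."""
    r = secrets.token_bytes(32)
    return com(message, r), r


def open_commitment(c: str, message: bytes, r: bytes) -> bool:
    """Check that (message, r) is the opening of commitment c."""
    return secrets.compare_digest(c, com(message, r))


c1, r1 = commit(b"encrypted water quality data")
c2, r2 = commit(b"encrypted water quality data")
```

Because r is drawn at random each time, `c1` and `c2` differ even though the message is the same, while `com` itself returns the same value for the same inputs.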

In step S730 of some embodiments, in a conventional signature scheme each signature carries its own signature value, so the total length of an aggregate signature grows linearly with the number of signatures. The succinct aggregate signature uses zero-knowledge proof technology to aggregate any number of signatures into a signature of constant size: even when the number of signatures is large, the size of the aggregate signature stays the same, which reduces the space and traffic required for storage and communication and relieves the pressure that grows with the number of participants. The present application uses a succinct aggregate signature algorithm to perform signature aggregation on all the preliminary signatures to obtain the target signature. The commitment c_i and the proof s_i of each preliminary signature are aggregated in sequence into a signature array, where each commitment c_i contains information of the same size, so that the size of the target signature obtained by aggregation is constant and does not change as the number n of participants increases.

Step S150 may include, but is not limited to, steps S740 to S750:

step S740, performing signature decomposition on the target signature to obtain a plurality of preliminary signatures;

step S750, each primary signature is checked according to the verification key.

In step S740 of some embodiments, after receiving the target signature, the verifier sequentially decomposes each individual signature from the target signature to obtain a plurality of preliminary signatures.

In step S750 of some embodiments, a verification algorithm is invoked to verify each preliminary signature. The verification algorithm uses the verification key in the public parameters to check the commitment c_i and the proof s_i of each preliminary signature. If all the preliminary signatures pass verification, the target signature is valid. If any preliminary signature fails verification, the target signature is invalid. Verifying the encrypted data with zero-knowledge proof technology ensures the authenticity of the data without leaking its private information. When the target signature is valid, signature verification passes, and the encrypted water quality data that passed verification is stored in the blockchain, which ensures that the data is tamper-proof and traceable.
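The decompose-and-verify rule of steps S740 to S750 can be sketched in Python as follows. The per-signature check `toy_check` is a hypothetical stand-in for the real proof verification algorithm, and rejecting an empty target signature is an added assumption rather than something the text states.

```python
def aggregate(preliminary_signatures):
    """Collect the (commitment, proof) pairs in order into one signature array."""
    return list(preliminary_signatures)


def verify_target_signature(target_signature, verify_one):
    """Valid only if every decomposed preliminary signature verifies."""
    pairs = list(target_signature)           # S740: decompose
    # S750: a single failing pair invalidates the whole target signature.
    return bool(pairs) and all(verify_one(c, s) for c, s in pairs)


def toy_check(c, s):
    # Hypothetical stand-in for the proof verification algorithm.
    return s == "proof-for-" + c


good = aggregate([("c1", "proof-for-c1"), ("c2", "proof-for-c2")])
bad = aggregate([("c1", "proof-for-c1"), ("c2", "forged")])
```

Only a target signature that returns True here would proceed to the step of storing the encrypted water quality data on the blockchain.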

The smart contract code is compiled into a binary format using a Solidity development tool, and the smart contract is deployed on the blockchain, where it becomes a node that can be accessed through the nodes of the blockchain. The corresponding function is called in the contract: a writeData() function is defined here to store the signed data into defined variables, for example a uint to store numbers or a string to store strings, or the data is passed through an interface provided by the contract to an off-chain application. The off-chain application can perform secondary analysis, processing, and presentation of the data, such as data statistics and model training. Off-chain processing responds faster and can carry out complex operations. With on-chain and off-chain collaboration, only the hash is stored on the chain and the original data is not, which protects privacy.
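The division of labour between on-chain hashes and off-chain data can be modelled in a few lines of Python. This is a plain in-memory simulation under stated assumptions, not a Solidity contract and not a call to any blockchain library; `ToyLedger`, `write_data`, the `DataWritten` event name and the use of SHA-256 are all hypothetical.

```python
import hashlib


def digest(payload: bytes) -> str:
    """Hash that stands in for the value recorded on the chain."""
    return hashlib.sha256(payload).hexdigest()


class ToyLedger:
    """In-memory stand-in for the contract storage and its event log."""

    def __init__(self):
        self.records = {}
        self.events = []

    def write_data(self, record_id, data_hash):
        if record_id in self.records:
            raise KeyError("record already written")
        self.records[record_id] = data_hash
        # Mimics emitting an event each time data is written.
        self.events.append({"event": "DataWritten", "record_id": record_id})


def store(ledger, off_chain, record_id, payload):
    """Keep the payload off chain and only its hash on the ledger."""
    off_chain[record_id] = payload
    ledger.write_data(record_id, digest(payload))


def check_integrity(ledger, off_chain, record_id):
    """Compare the off-chain payload against the hash on the ledger."""
    return ledger.records.get(record_id) == digest(off_chain[record_id])


ledger, off_chain = ToyLedger(), {}
store(ledger, off_chain, "sample-1", b"encrypted water quality data")
intact = check_integrity(ledger, off_chain, "sample-1")
off_chain["sample-1"] = b"modified data"
tampered = check_integrity(ledger, off_chain, "sample-1")
```

Any change to the off-chain payload makes the integrity check fail, while the ledger itself never holds the original data.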

The data stored on the blockchain is analyzed to extract valuable information, which provides a basis for adjusting and optimizing the sewage treatment equipment. For example, if an abnormality is found in the water quality parameters, equipment inspection or pollutant investigation can be carried out in time. Data can be extracted from the variables stored by the contract by calling an interface function of the smart contract. The written data can also be extracted by parsing the transaction and event logs on the blockchain. It should be noted that each time data is written to the chain through the smart contract, an event is recorded in a log via emit; the event log includes the contract address, the event name, and the event parameters, and the event details can be extracted by parsing the topics and data of the log. The extracted data may contain redundancy, duplicates, and errors, so data cleaning is required to handle missing values, smooth noise, and unify formats. The models, indicators, and other results obtained by data analysis are applied to scenarios such as sewage treatment optimization, equipment fault prediction, and pollution source positioning. The analysis results are presented visually in the form of reports, charts, and the like. Systematic analysis of the blockchain data allows the value of the data to be mined in depth, making sewage treatment more intelligent and more refined.

The data analysis result and the signature verification result can be used for generating a detection report, and the data can be shared to other platforms, so that the water quality data can be comprehensively managed.

Through the above steps S710 to S750, the encrypted water quality data is signed and the signature is verified, so that the authenticity of the data can be ensured. The sewage detection method based on zero-knowledge proof technology provides real-time detection, privacy security, authenticity verification, tamper resistance, and traceability of the data, and offers a new detection means for water resource protection and sewage treatment.

Referring to fig. 8, an embodiment of the present application further provides a water quality data processing apparatus, which may implement the water quality data processing method, where the water quality data processing apparatus includes:

an acquisition module 810 for acquiring initial water quality data; the initial water quality data includes sensitive data;

the noise generation module 820 is configured to perform noise generation on the sensitive data through a preset noise generation network to obtain pseudo data;

the encryption module 830 is configured to encrypt the initial water quality data and the dummy data to obtain encrypted water quality data;

a signature module 840 for signing the encrypted water quality data to determine a target signature;

The signature verification module 850 is used for verifying the target signature and storing the encrypted water quality data passing through the signature verification into the blockchain;

the decryption module 860 is configured to, in response to a transaction request for the blockchain, decrypt the encrypted water quality data according to the transaction request to obtain decrypted water quality data; the decrypted water quality data includes a data identification tag; the data identification tag is used to distinguish between sensitive data and pseudo data.

The specific implementation of the water quality data processing device is basically the same as the specific embodiment of the water quality data processing method, and is not described herein.

The embodiment of the application also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the water quality data processing method when executing the computer program. The electronic equipment can be any intelligent terminal including a tablet personal computer, a vehicle-mounted computer and the like.

Referring to fig. 9, fig. 9 illustrates a hardware structure of an electronic device according to another embodiment, the electronic device includes:

the processor 910 may be implemented by a general-purpose CPU (central processing unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is used for executing related programs to implement the technical solutions provided by the embodiments of the present application;

the memory 920 may be implemented in the form of read-only memory (ROM), static storage, dynamic storage, or random access memory (RAM). The memory 920 may store an operating system and other application programs; when the technical solutions provided in the embodiments of the present application are implemented by software or firmware, the relevant program code is stored in the memory 920 and invoked by the processor 910 to execute the water quality data processing method of the embodiments of the present application;

an input/output interface 930 for inputting and outputting information;

the communication interface 940 is configured to implement communication interaction between the present device and other devices, and may implement communication in a wired manner (e.g., USB, network cable, etc.), or may implement communication in a wireless manner (e.g., mobile network, WIFI, bluetooth, etc.);

a bus 950 for transferring information between components of the device (e.g., processor 910, memory 920, input/output interface 930, and communication interface 940);

wherein processor 910, memory 920, input/output interface 930, and communication interface 940 implement communication connections among each other within the device via a bus 950.

The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the water quality data processing method when being executed by a processor.

The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The embodiments described herein are intended to explain the technical solutions of the embodiments of the present application more clearly and do not limit those technical solutions. As those skilled in the art will appreciate, with the evolution of technology and the emergence of new application scenarios, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.

It will be appreciated by those skilled in the art that the technical solutions shown in the figures do not constitute limitations of the embodiments of the present application, and may include more or fewer steps than shown, or may combine certain steps, or different steps.

The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Those of ordinary skill in the art will appreciate that all or some of the steps of the methods disclosed above, and the systems and functional modules/units of the devices disclosed above, may be implemented as software, firmware, hardware, or suitable combinations thereof.

The terms "first," "second," "third," "fourth," and the like in the description of the present application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

It should be understood that in this application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and covers three cases; for example, "A and/or B" may represent: only A is present, only B is present, or both A and B are present, where A and B may each be singular or plural. The character "/" generally indicates that the objects before and after it are in an "or" relationship. "At least one of" and similar expressions mean any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may each be single or plural.
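The enumeration given for "at least one of a, b or c" can be checked mechanically. The following is a minimal Python sketch; the function name `at_least_one_of` is hypothetical and is used here for illustration only. It lists every non-empty combination of the items, which for three items yields the seven cases named in the text.

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of items, smallest combinations first.

    Hypothetical helper for illustration; it is not part of the application.
    """
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


selections = at_least_one_of(["a", "b", "c"])
for combo in selections:
    # Prints: a, b, c, a and b, a and c, b and c, a and b and c (one per line)
    print(" and ".join(combo))
```

For n items the list has 2^n - 1 entries, since every subset except the empty one qualifies.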

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of units described above is merely a division by logical function, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in electrical, mechanical or other form.

The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application may be embodied, in essence or in whole or in part, in the form of a software product stored in a storage medium, the software product including a plurality of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the various embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing a program.

Preferred embodiments of the present application are described above with reference to the accompanying drawings; this description does not thereby limit the scope of the claims of the embodiments of the present application. Any modifications, equivalent substitutions and improvements made by those skilled in the art without departing from the scope and spirit of the embodiments of the present application shall fall within the scope of the claims of the embodiments of the present application.

Claims

1. A water quality data processing method, characterized in that the method comprises:

signing the encrypted water quality data to determine a target signature;

2. The water quality data processing method according to claim 1, wherein the preset noise generation network is trained according to the following steps:

acquiring sample water quality data and randomly generating a noise sequence;

obtaining a guide vector;

calculating generative adversarial loss data;

3. The water quality data processing method according to claim 2, wherein obtaining the guide vector comprises:

acquiring the number of times the preset noise generation network has been trained;

if the number of times is zero, randomly generating the guide vector;

4. The water quality data processing method according to claim 2, wherein calculating the generative adversarial loss data comprises:

or,

5. The water quality data processing method according to any one of claims 1 to 4, wherein encrypting the initial water quality data and the pseudo data to obtain encrypted water quality data comprises:

acquiring a first prime number and a second prime number;

6. The water quality data processing method according to claim 5, wherein encrypting the initial water quality data and the pseudo data by the first public key to obtain the encrypted water quality data comprises:

7. The water quality data processing method according to any one of claims 1 to 4, wherein signing the encrypted water quality data to determine a target signature comprises:

obtaining public parameters and a second private key;

and verifying each primary signature according to the verification key.

8. A water quality data processing apparatus, the apparatus comprising:

9. An electronic device comprising a memory and a processor, the memory storing a computer program, the processor implementing the water quality data processing method of any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the water quality data processing method of any one of claims 1 to 7.
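
Claims 5 and 6 refer to acquiring a first prime number and a second prime number and to encrypting with a first public key. The claim text reproduced here does not name the cryptosystem, so the following Python sketch assumes a textbook RSA-style scheme purely for illustration. The function names, the public exponent 17, the toy primes 61 and 53, and the integer-encoded reading are all hypothetical; a practical system would use large random primes and a padding scheme.

```python
from math import gcd


def make_rsa_keys(p, q, e=17):
    """Derive a textbook RSA key pair from two primes (assumed scheme)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p - 1) * (q - 1)")
    d = pow(e, -1, phi)  # modular inverse; requires Python 3.8 or later
    return (n, e), (n, d)


def encrypt(value, public_key):
    """Encrypt a non-negative integer smaller than the modulus."""
    n, e = public_key
    if not 0 <= value < n:
        raise ValueError("value must lie in [0, n)")
    return pow(value, e, n)


def decrypt(cipher, private_key):
    """Recover the integer that was encrypted with the matching public key."""
    n, d = private_key
    return pow(cipher, d, n)


# Toy primes for illustration only; they offer no security.
public_key, private_key = make_rsa_keys(61, 53)
reading = 65  # hypothetical integer-encoded water quality reading
cipher = encrypt(reading, public_key)
recovered = decrypt(cipher, private_key)
```

In this sketch real readings and pseudo data would be encrypted the same way, so ciphertexts alone would not reveal which is which; how the application actually encodes and tags the data is not shown in the claim text above.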