Disclosure of Invention
The embodiments of the present invention aim to provide a 5G and blockchain-based gene big data disease prediction system and prediction method that improve the security of gene data. The specific technical solution is as follows:
in a first aspect of the embodiments of the present invention, there is provided a 5G and blockchain-based gene big data disease prediction system, the system comprising a user terminal, a blockchain node server and a disease prediction server, which are connected with each other through a 5G communication network;
the user terminal is configured to: sending a disease prediction request to the disease prediction server through the 5G communication network, wherein the disease prediction request carries a user identity and encrypted gene data of a user, and the encrypted gene data is obtained by encrypting original gene data of the user by using a public key of the user;
after the disease prediction server receives the disease prediction request, the disease prediction server is configured to: generating a private key query request according to the user identity in the disease prediction request, wherein the private key query request carries the user identity;
the disease prediction server is further configured to: sign the private key query request by using its own private key, and send the signed private key query request to the blockchain node server through the 5G communication network;
after the blockchain node server receives the private key query request, the blockchain node server is configured to: perform signature verification on the private key query request by using the public key of the disease prediction server; after the signature verification passes, query the user private key corresponding to the user identity from a local ledger database by using the user identity carried in the private key query request as an index; and then send the queried user private key to the disease prediction server;
after the disease prediction server receives the user private key, the disease prediction server is further configured to: decrypting the encrypted gene data by using the user private key to obtain original gene data, and predicting diseases by using the original gene data to obtain a disease prediction result;
the disease prediction server is further configured to: send the disease prediction result to the user terminal.
In a second aspect of the embodiments of the present invention, there is provided a 5G and blockchain-based gene big data disease prediction method applied to a disease prediction server, the method including:
receiving a disease prediction request sent by a user terminal through a 5G communication network, wherein the disease prediction request carries a user identity and encrypted gene data of a user, and the encrypted gene data is obtained by encrypting original gene data of the user by using a public key of the user;
responding to the disease prediction request, and generating a private key query request according to a user identity in the disease prediction request, wherein the private key query request carries the user identity;
signing the private key query request by using the disease prediction server's own private key, and sending the signed private key query request to a blockchain node server through the 5G communication network, so that the blockchain node server performs signature verification on the private key query request by using the public key of the disease prediction server, queries, after the signature verification passes, the user private key corresponding to the user identity from a local ledger database by using the user identity carried in the private key query request as an index, and then returns the queried user private key to the disease prediction server;
receiving a user private key returned by the block chain node server, decrypting the encrypted gene data by using the user private key to obtain original gene data, and predicting diseases by using the original gene data to obtain a disease prediction result;
and returning the disease prediction result to the user terminal through a 5G communication network.
In the invention, the disease prediction request sent by the user terminal to the disease prediction server carries encrypted gene data instead of original gene data. The original gene data therefore cannot be read by a hacker who intercepts the request, which improves the security of the original gene data. After receiving the disease prediction request, the disease prediction server queries the user private key from the blockchain node server, decrypts the encrypted gene data by using the queried user private key to obtain the original gene data, and performs disease prediction by using the original gene data to obtain a disease prediction result. Remote disease diagnosis is thus achieved.
In addition, in the invention, in order to query the user private key from the blockchain node server, the disease prediction server generates a private key query request and signs it by using the private key of the disease prediction server. After receiving the private key query request, the blockchain node server performs signature verification on it by using the public key of the disease prediction server, and sends the user private key to the disease prediction server only if the signature verification passes. This prevents a hacker from impersonating the disease prediction server and illegally obtaining the user private key from the blockchain node server.
It should be further noted that, in the present invention, the user private key is registered through the blockchain. The tamper-resistance of blockchain technology helps prevent hackers from tampering with the user private key, thereby further improving the security of the gene data.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
With the application and popularization of the mobile internet, more and more users begin to conduct business online by means of mobile terminals. Taking the diagnosis of diseases by a remote medical system as an example, a patient user generally needs to upload medical data (such as a focus image, gene data, and the like) of the patient user to a disease prediction server by using a user terminal, and the disease prediction server makes a corresponding disease prediction according to the medical data uploaded by the user terminal.
In view of the above, the present invention provides a 5G and blockchain-based gene big data disease prediction system and prediction method, aiming at improving the security of gene data.
Referring to fig. 1, fig. 1 is a schematic diagram of a 5G and blockchain-based gene big data disease prediction system according to an embodiment of the present invention. As shown in fig. 1, the prediction system includes a user terminal, a blockchain node server and a disease prediction server, which are connected with each other through a 5G communication network.
In the invention, the blockchain node server and a plurality of other blockchain node servers jointly form a blockchain network, which implements blockchain technology during operation.
As shown in fig. 1, the user terminal is configured to: send a disease prediction request to the disease prediction server through the 5G communication network, wherein the disease prediction request carries a user identity and encrypted gene data of the user, and the encrypted gene data is obtained by encrypting original gene data of the user by using a public key of the user.
In specific implementation, after obtaining original gene data of a user, a user terminal encrypts the original gene data by using a public key of the user to obtain encrypted gene data. And then the user terminal assembles the user identity identification and the encrypted gene data into a disease prediction request and sends the disease prediction request to a disease prediction server.
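The encrypt-then-assemble step on the user terminal can be sketched as follows. This is a minimal illustration only: it uses toy textbook RSA built from two Mersenne primes, which is not secure and is not the cipher prescribed by the invention. The field names (`user_id`, `length`, `encrypted_gene_data`), the chunk size and the helper names are all assumptions introduced for the example.

```python
# Toy textbook RSA (no padding, tiny key). NOT secure; it only shows the data flow.
# All names and the request layout are hypothetical.
P, Q, E = 2**31 - 1, 2**61 - 1, 65537        # two Mersenne primes, public exponent
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))            # private exponent (Python 3.8+)
USER_PUBLIC_KEY, USER_PRIVATE_KEY = (E, N), (D, N)
CHUNK = 8                                    # bytes per block, so each block < 2**64 < N


def encrypt(data: bytes, public_key) -> list:
    """Encrypt bytes block by block with the user's public key."""
    e, n = public_key
    padded = data + b"\x00" * (-len(data) % CHUNK)   # zero-pad to a whole block
    return [pow(int.from_bytes(padded[i:i + CHUNK], "big"), e, n)
            for i in range(0, len(padded), CHUNK)]


def decrypt(blocks, private_key, length: int) -> bytes:
    """Decrypt the blocks with the user's private key and drop the zero padding."""
    d, n = private_key
    plain = b"".join(pow(c, d, n).to_bytes(CHUNK, "big") for c in blocks)
    return plain[:length]


def build_disease_prediction_request(user_id: str, raw_gene_data: bytes) -> dict:
    """Assemble the user identity and the encrypted gene data into one request."""
    return {
        "user_id": user_id,
        "length": len(raw_gene_data),
        "encrypted_gene_data": encrypt(raw_gene_data, USER_PUBLIC_KEY),
    }


request = build_disease_prediction_request("user-001", b"ACGTTGCAACGT")
recovered = decrypt(request["encrypted_gene_data"], USER_PRIVATE_KEY, request["length"])
```

The last line plays the role of the disease prediction server once it holds the user private key. A real deployment would use a vetted cryptographic library with proper padding or a hybrid scheme.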
As shown in fig. 1, after the disease prediction server receives the disease prediction request, the disease prediction server is configured to: generate a private key query request according to the user identity in the disease prediction request, wherein the private key query request carries the user identity. The disease prediction server is further configured to: sign the private key query request by using its own private key, and send the signed private key query request to the blockchain node server through the 5G communication network.
During specific implementation, after a disease prediction server receives a disease prediction request sent by a user terminal, a user identity is read from the disease prediction request, then the user identity is assembled into a private key query request, then the private key query request is signed by using a private key of the disease prediction server, and finally the signed private key query request is sent to a blockchain node server.
As shown in fig. 1, after the blockchain node server receives the private key query request, the blockchain node server is configured to: perform signature verification on the private key query request by using the public key of the disease prediction server; after the signature verification passes, query the user private key corresponding to the user identity from a local ledger database by using the user identity carried in the private key query request as an index; and then send the queried user private key to the disease prediction server.
In a specific implementation, the ledger database of the blockchain node server stores the user private key corresponding to each user identity, and also stores the public key of the disease prediction server. After the blockchain node server receives the private key query request sent by the disease prediction server, it first retrieves the public key of the disease prediction server from the ledger database and uses it to perform signature verification on the private key query request. If the signature verification passes, the private key query request was sent by the disease prediction server rather than by a hacker. Therefore, after the signature verification passes, the blockchain node server reads the user identity from the private key query request and, using the user identity as an index, reads the corresponding user private key from the ledger database. Finally, the blockchain node server returns the user private key to the disease prediction server.
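The sign, verify and look-up exchange described above can be sketched as follows. This is an illustrative sketch, not the signature scheme of the invention: it uses a toy hash-then-RSA signature with a small key built from Mersenne primes, and the ledger layout, message format and function names are assumptions made for the example.

```python
import hashlib
import json

# Toy key pair of the disease prediction server. NOT secure; illustration only.
P, Q, E = 2**61 - 1, 2**89 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))
SERVER_PRIVATE_KEY, SERVER_PUBLIC_KEY = (D, N), (E, N)


def digest(message: bytes, n: int) -> int:
    """Hash the message and reduce it into the RSA modulus range."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes, private_key) -> int:
    d, n = private_key
    return pow(digest(message, n), d, n)


def verify(message: bytes, signature: int, public_key) -> bool:
    e, n = public_key
    return pow(signature, e, n) == digest(message, n)


# Hypothetical ledger database held by the blockchain node server.
LEDGER = {
    "server_public_key": SERVER_PUBLIC_KEY,
    "user_private_keys": {"user-001": "user-001-private-key"},
}


def make_signed_query(user_id: str) -> dict:
    """Disease prediction server: build and sign the private key query request."""
    body = json.dumps({"type": "private_key_query", "user_id": user_id}).encode()
    return {"body": body, "signature": sign(body, SERVER_PRIVATE_KEY)}


def handle_query(query: dict):
    """Blockchain node server: verify the signature, then look up by user identity."""
    if not verify(query["body"], query["signature"], LEDGER["server_public_key"]):
        return None                       # reject requests not signed by the server
    user_id = json.loads(query["body"])["user_id"]
    return LEDGER["user_private_keys"].get(user_id)


query = make_signed_query("user-001")
returned_key = handle_query(query)
forged = dict(query, signature=query["signature"] + 1)   # tampered signature
rejected = handle_query(forged)
```

A request whose signature does not verify yields `None`, so the user private key is returned only to a caller holding the server's private key.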
As shown in fig. 1, after the disease prediction server receives the user private key, the disease prediction server is further configured to: decrypt the encrypted gene data by using the user private key to obtain the original gene data, and perform disease prediction by using the original gene data to obtain a disease prediction result.
Optionally, in some embodiments, the disease prediction server, when used for disease prediction, is specifically configured to: determining gene sites generating gene mutation from original gene data, combining all gene sites generating gene mutation into a gene mutation vector, and inputting the gene mutation vector into a pre-trained disease prediction model, thereby obtaining a disease prediction result output by the disease prediction model.
In a specific implementation, after one or more gene loci at which gene mutation occurs are determined from the original gene data, each gene locus is converted into a numerical expression. For example, the sequence number of a gene locus in the gene sequence may be used as the numerical expression of that locus. The numerical expressions of the one or more gene loci are then arranged into a gene mutation vector, and the vector is padded with the number 0 to a preset length. For example, if the gene mutation vector arranged from the numerical expressions has 12 positions and the preset length is 50 positions, the 12-position vector is padded with 0 up to 50 positions. Finally, the gene mutation vector of the preset length is input into a pre-trained disease prediction model to obtain the disease prediction result output by the model.
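The construction of the padded gene mutation vector can be sketched as follows. The text does not say how mutated loci are detected, so the comparison against a reference sequence, the 1-based locus numbering (which keeps 0 free as the padding value) and both function names are assumptions made for illustration.

```python
def find_mutated_loci(sample: str, reference: str) -> list:
    """Return 1-based sequence numbers of loci where the sample differs from the
    reference. Comparing against a reference sequence is an assumption."""
    return [i + 1 for i, (s, r) in enumerate(zip(sample, reference)) if s != r]


def build_mutation_vector(mutated_loci, preset_length: int = 50) -> list:
    """Arrange the locus numbers into a vector and pad it with 0 to preset_length."""
    loci = list(mutated_loci)
    if len(loci) > preset_length:
        raise ValueError("more mutated loci than the preset vector length")
    return loci + [0] * (preset_length - len(loci))


loci = find_mutated_loci("ACCTACGA", "ACGTACGT")
vector = build_mutation_vector(loci, preset_length=5)
```

With these inputs the sample differs from the reference at the third and eighth loci, so the padded vector is `[3, 8, 0, 0, 0]`.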
For example, the disease prediction model may be a model for detecting liver cancer that outputs a disease prediction result in the form of "yes" or "no". If the disease prediction result output by the model is "no", the user is predicted not to have liver cancer.
As shown in fig. 1, the disease prediction server is further configured to: send the disease prediction result to the user terminal.
In the invention, the disease prediction request sent by the user terminal to the disease prediction server carries encrypted gene data instead of original gene data. The original gene data therefore cannot be read by a hacker who intercepts the request, which improves the security of the original gene data. After receiving the disease prediction request, the disease prediction server queries the user private key from the blockchain node server, decrypts the encrypted gene data by using the queried user private key to obtain the original gene data, and performs disease prediction by using the original gene data to obtain a disease prediction result. Remote disease diagnosis is thus achieved.
In addition, in the invention, in order to query the user private key from the blockchain node server, the disease prediction server generates a private key query request and signs it by using the private key of the disease prediction server. After receiving the private key query request, the blockchain node server performs signature verification on it by using the public key of the disease prediction server, and sends the user private key to the disease prediction server only if the signature verification passes. This prevents a hacker from impersonating the disease prediction server and illegally obtaining the user private key from the blockchain node server.
It should be further noted that, in the present invention, the user private key is registered through the blockchain. The tamper-resistance of blockchain technology helps prevent hackers from tampering with the user private key, thereby further improving the security of the gene data.
Optionally, in some specific embodiments, as shown in fig. 1, the user terminal is further configured to: submit a private key storage transaction to the blockchain node server in advance through the 5G communication network, wherein the private key storage transaction carries the user private key and the user identity of the user.
After the blockchain node server receives the private key storage transaction, the blockchain node server is further configured to: broadcast the private key storage transaction in the blockchain network, and store the user private key and the user identity carried in the private key storage transaction, in association with each other, in a local ledger database.
The blockchain node server is further configured to: sign a preset first character string by using the user private key to obtain a second character string; sign the second character string by using the private key of the blockchain node server to obtain a third character string; and finally send the third character string to the user terminal.
After receiving the third character string, the user terminal performs signature verification on the third character string by using the public key of the blockchain node server to obtain a fourth character string; performs signature verification on the fourth character string by using the public key of the user to obtain a fifth character string; and finally judges whether the fifth character string is equal to the preset first character string, and if so, determines that the blockchain node server has stored the user private key in the ledger database.
For ease of understanding, assume, by way of example, that the preset first character string is "abcde". After the blockchain node server stores the user private key and the user identity in association with each other in the local ledger database, the blockchain node server signs "abcde" by using the user private key to obtain a second character string. The blockchain node server then signs the second character string by using its own private key to obtain a third character string. After receiving the third character string, the user terminal performs signature verification on the third character string by using the public key of the blockchain node server to obtain a fourth character string, and then performs signature verification on the fourth character string by using the public key of the user to obtain a fifth character string. If the fifth character string is exactly "abcde", this indicates that the blockchain node server has actually recorded the user private key in the ledger database and that no hacker has tampered with it in the meantime. The security of the user private key, and hence of the gene data, is thereby ensured.
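The "abcde" round trip above can be sketched as follows. The sketch assumes a signature scheme with message recovery and models it with toy textbook RSA, where signing is exponentiation with the private key and verification is exponentiation with the public key. The keys are small Mersenne-prime products, are not secure, and together with all function names are assumptions for illustration. The node modulus is chosen larger than the user modulus so that the second character string fits inside it.

```python
def make_keypair(p: int, q: int, e: int = 65537):
    """Return (public_key, private_key) for toy textbook RSA. NOT secure."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))     # modular inverse, Python 3.8+
    return (e, n), (d, n)


USER_PUBLIC, USER_PRIVATE = make_keypair(2**31 - 1, 2**61 - 1)
NODE_PUBLIC, NODE_PRIVATE = make_keypair(2**89 - 1, 2**107 - 1)

FIRST_STRING = int.from_bytes(b"abcde", "big")   # preset first character string


def apply_key(value: int, key) -> int:
    """Sign (private key) or recover (public key): value ** exponent mod modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


def node_challenge(stored_user_private_key) -> int:
    """Blockchain node server: sign with the stored user key, then with its own key."""
    second = apply_key(FIRST_STRING, stored_user_private_key)
    third = apply_key(second, NODE_PRIVATE)
    return third


def user_accepts(third: int) -> bool:
    """User terminal: undo both signatures and compare with the preset string."""
    fourth = apply_key(third, NODE_PUBLIC)
    fifth = apply_key(fourth, USER_PUBLIC)
    return fifth == FIRST_STRING


third = node_challenge(USER_PRIVATE)
fifth = apply_key(apply_key(third, NODE_PUBLIC), USER_PUBLIC)
```

The check succeeds only when the key the node signed with is the inverse of the user's public key, which is what lets the user terminal conclude that the correct private key was stored.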
Optionally, in some embodiments, the disease prediction server is further configured to train the disease prediction model by: collecting sample gene data of a plurality of sample users with the disease; for each piece of sample gene data, determining the gene loci at which gene mutation occurs and forming these gene loci into a sample gene mutation vector; training a BP neural network by taking the plurality of sample gene mutation vectors as training data; and determining the trained BP neural network as the disease prediction model.
Preferably, in the training process of the BP neural network, the initial weight and the threshold of the BP neural network are optimized by adopting a particle swarm algorithm.
In the calculation process of the particle swarm optimization, the speed and the position of the particles are updated according to the following formula:
$$V_I(\tau) = \omega V_I(\tau-1) + z_1 r_1 \left(P_I(\tau-1) - X_I(\tau-1)\right) + z_2 r_2 \left(P(\tau-1) - X_I(\tau-1)\right)$$
$$X_I(\tau) = X_I(\tau-1) + V_I(\tau)$$
In the formulas, $z_1$ and $z_2$ are learning factors, $\omega$ is the inertia weight factor, and $r_1$ and $r_2$ are random numbers with $r_1, r_2 \in (0, 1)$. Let $l_I$ denote the I-th particle in the particle swarm; $V_I(\tau-1)$ and $X_I(\tau-1)$ are respectively the velocity and position of particle $l_I$ after the $(\tau-1)$-th iteration update; $V_I(\tau)$ and $X_I(\tau)$ are respectively the velocity and position of particle $l_I$ after the $\tau$-th iteration update; $P_I(\tau-1)$ represents the individual optimal position of particle $l_I$ after the $(\tau-1)$-th iteration update, and $P(\tau-1)$ represents the global optimal position of the particle swarm after the $(\tau-1)$-th iteration update.
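The velocity and position update can be sketched in a few lines. This is a minimal sketch of the two update formulas only; the default values of `omega`, `z1` and `z2`, the list-of-lists representation and the function name are assumptions. Note that `random.random()` draws from [0, 1), which approximates the open interval (0, 1) stated above.

```python
import random


def pso_step(positions, velocities, personal_best, global_best,
             omega=0.7, z1=1.5, z2=1.5, rng=random):
    """Apply one velocity and position update to every particle.

    positions, velocities, personal_best: one coordinate list per particle.
    global_best: coordinate list of the swarm's best position so far.
    """
    new_positions, new_velocities = [], []
    for x, v, p in zip(positions, velocities, personal_best):
        r1, r2 = rng.random(), rng.random()      # one pair of random numbers per particle
        new_v = [omega * vi + z1 * r1 * (pi - xi) + z2 * r2 * (gi - xi)
                 for xi, vi, pi, gi in zip(x, v, p, global_best)]
        new_x = [xi + vi for xi, vi in zip(x, new_v)]
        new_velocities.append(new_v)
        new_positions.append(new_x)
    return new_positions, new_velocities


# A particle already sitting on both optima with zero velocity does not move.
still_x, still_v = pso_step([[1.0, 2.0]], [[0.0, 0.0]], [[1.0, 2.0]], [1.0, 2.0])
# A particle at 0 with both optima at 1 is pulled toward 1.
moved_x, moved_v = pso_step([[0.0]], [[0.0]], [[1.0]], [1.0])
```

When the particles encode the initial weights and thresholds of the BP neural network, each coordinate list would hold one candidate set of those parameters.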
After each iteration update of the particle swarm algorithm, local enhancement optimization particles are selected from the particle swarm to carry out local optimization, specifically as follows:
(1) Select the local enhancement optimization particles in the particle swarm. Let $O(\tau)$ denote the set of local enhancement optimization particles selected from the particle swarm after the $\tau$-th iteration update. The particles whose fitness function value after the $\tau$-th iteration update is smaller than the mean fitness function value of the particle swarm after the $\tau$-th iteration update are taken as candidate particles for local enhancement optimization particles. Let $O'(\tau)$ denote the set of these candidate particles. The candidate particle with the minimum fitness function value in the set $O'(\tau)$ is selected as the first local enhancement optimization particle and added to the set $O(\tau)$, the selected particle is deleted from the set $O'(\tau)$, and the candidate particles in the set $O'(\tau)$ are screened according to the selected local enhancement optimization particle, specifically:
let $l'_G$ denote the G-th candidate particle in the set $O'(\tau)$, with $l'_G$ being the first selected local enhancement optimization particle; let $X'_G(\tau)$ denote the position of particle $l'_G$ updated at the $\tau$-th iteration, and let $\Omega'_G(\tau)$ denote the local neighborhood of particle $l'_G$ after the $\tau$-th iteration update. The candidate particles in the set $O'(\tau)$ that are located in the local neighborhood $\Omega'_G(\tau)$ are deleted from the set $O'(\tau)$, where $\Omega'_G(\tau)$ is the local region centered on particle $l'_G$ with $d(\tau)$ as its radius, and $d(\tau)$ is the neighborhood radius of the particle swarm at the $\tau$-th iteration update, given by a formula [formula missing in source] in which $d(0)$ is the initial neighborhood radius, $\tau$ is the current number of iteration updates, and $T_{max}$ is the maximum number of iteration updates;
in the same way, the candidate particle with the minimum fitness function value among the remaining candidate particles in the set $O'(\tau)$ is selected as a local enhancement optimization particle and added to the set $O(\tau)$, the selected particle is deleted from the set $O'(\tau)$, and the candidate particles in the set $O'(\tau)$ are screened according to the selected local enhancement optimization particle;
the selection stops when no candidate particles remain in the set $O'(\tau)$, and the particles in the set $O(\tau)$ are then the local enhancement optimization particles selected from the particle swarm;
(2) The local enhancement optimization particles in the set $O(\tau)$ perform local optimization in the following way:
Consider the S-th local enhancement optimization particle in the set $O(\tau)$, its position updated at the $\tau$-th iteration, and its local neighborhood after the $\tau$-th iteration update, which is the local neighborhood centered on that position with $d(\tau)$ as its radius. For a particle in this local neighborhood after the $\tau$-th iteration update, two positions are randomly selected in the local neighborhood, and a new sub-position of the particle is generated according to a formula [formula missing in source]. The quantities in that formula are: the new sub-position generated by the particle after the $\tau$-th iteration update; the position with the minimum fitness function value in the local neighborhood; the K-th particle and the L-th particle in the local neighborhood; and the positions of these two particles updated at the $\tau$-th iteration;
the set of positions of the particles in the local neighborhood updated at the $\tau$-th iteration is defined [formula missing in source], its elements being the position of each particle updated at the $\tau$-th iteration, up to the number of particles in the local neighborhood; likewise, the set of new sub-positions generated by the particles in the local neighborhood after the $\tau$-th iteration update is defined [formula missing in source];
a detection function of the local neighborhood after the $\tau$-th iteration update is defined [formula missing in source]. The quantities in its expression are: the optimization space detection coefficient of the local neighborhood and a judgment function of that coefficient; and the optimization performance detection coefficient of the local neighborhood and a judgment function of that coefficient. These are expressed in terms of the Z-th particle in the local neighborhood, the position of that particle updated at the $\tau$-th iteration, the new sub-position it generated after the $\tau$-th iteration update, the fitness function value corresponding to that position, and the fitness function value corresponding to that new sub-position;
depending on the value of the detection function [conditions missing in source], either the positions of the particles in the local neighborhood after the $\tau$-th iteration update are kept unchanged, or the positions of the particles in the local neighborhood are changed to the new sub-positions generated after the $\tau$-th iteration update.
In this preferred embodiment, because the convergence speed and prediction accuracy of the BP neural network are easily affected by its initial weights and thresholds, an improved particle swarm algorithm is adopted to optimize the initial weights and thresholds of the BP neural network. In the improved particle swarm algorithm, after each iteration update of the particle swarm, particles with better optimization performance and a more dispersed distribution are selected from the particle swarm as local enhancement optimization particles to strengthen local optimization. The local neighborhood search strategy set for the local enhancement optimization particles searches the local neighborhood effectively while increasing the diversity of particle positions: each particle in the local neighborhood generates a new sub-position in the local neighborhood according to the search strategy, which yields a new sub-position set. A detection function of the local neighborhood is then defined. The optimization space detection coefficient in the detection function judges whether the new sub-positions generated by the particles in the local neighborhood cover a larger optimization space than the original positions of those particles, and the optimization performance detection coefficient judges whether the new sub-positions are better than the original positions. According to the result of the detection function, the particles in the local neighborhood take their positions from the better position set. This enhances the local optimization precision of the particle swarm algorithm, overcomes its weak local search capability and its tendency to fall into local extrema, and gives it better optimization capability.
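The selection of local enhancement optimization particles in step (1) above can be sketched as a greedy loop. This is a sketch under stated assumptions: the neighborhood radius is passed in as a plain number because the formula for d(τ) is not available in this text, Euclidean distance is assumed for the local neighborhood, a candidate exactly on the boundary is treated as inside it, and the function name is hypothetical.

```python
import math


def select_local_enhancement_particles(positions, fitness, radius):
    """Return indices of the particles chosen as local enhancement optimization particles.

    positions: one coordinate tuple per particle after the current iteration.
    fitness:   fitness function value per particle (smaller is better).
    radius:    neighborhood radius for the current iteration (supplied by the caller).
    """
    mean_fitness = sum(fitness) / len(fitness)
    # Candidate set: particles whose fitness is below the swarm mean.
    candidates = [i for i, f in enumerate(fitness) if f < mean_fitness]
    selected = []
    while candidates:
        best = min(candidates, key=lambda i: fitness[i])
        selected.append(best)
        # Remove the chosen particle and every candidate inside its neighborhood.
        candidates = [i for i in candidates
                      if i != best and math.dist(positions[i], positions[best]) > radius]
    return selected


positions = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (9.0, 9.0)]
fitness = [1.0, 1.5, 2.0, 2.5, 10.0]
chosen = select_local_enhancement_particles(positions, fitness, radius=1.0)
```

In this example the mean fitness is 3.4, so the first four particles are candidates; particle 0 is chosen and removes its neighbor 1, then particle 2 is chosen and removes its neighbor 3. The result keeps one well-performing particle per region, which matches the stated goal of selecting particles that are both good and dispersed.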
For ease of understanding, assume, by way of example, that a disease prediction model for predicting liver cancer is to be trained. Sample gene data is collected from a plurality of (for example, 200) sample users with liver cancer. For each piece of sample gene data, the gene loci at which gene mutation occurs are determined, and these gene loci are combined into a sample gene mutation vector. The sample gene mutation vector is then padded with the number 0 to the preset length. Next, the preset BP neural network is trained by using the sample gene mutation vectors of the preset length. Finally, the trained BP neural network is determined as the disease prediction model for detecting liver cancer.
Optionally, in some specific embodiments, as shown in fig. 1, before the disease prediction server sends the disease prediction result to the user terminal, the disease prediction server is further configured to: sign the disease prediction result by using the user private key.
After the user terminal receives the signed disease prediction result, the user terminal is further configured to: perform signature verification on the signed disease prediction result by using the public key of the user.
In the invention, the user public key is held only by the user terminal, and the user private key is held only by the blockchain node server and the disease prediction server. When the disease prediction server signs the disease prediction result by using the user private key, the disease prediction result is in effect encrypted, and only the user terminal can decrypt it by using the user public key. This prevents the disease prediction result from being disclosed and improves the privacy of the user.
In addition, because the disease prediction server signs the disease prediction result by using the user private key, the signature proves that the disease prediction server has obtained the user private key from the blockchain node server, and thus that the whole process has not been attacked by hackers.
Based on the same inventive concept, the invention provides a 5G and blockchain-based gene big data disease prediction method. Referring to fig. 2, fig. 2 is a flowchart of a 5G and blockchain-based gene big data disease prediction method according to an embodiment of the present invention; the prediction method is applied to a disease prediction server. It should be noted that the prediction method shown in fig. 2 and the prediction system shown in fig. 1 can be cross-referenced.
As shown in fig. 2, the prediction method includes the following steps:
Step S21: receiving a disease prediction request sent by a user terminal through a 5G communication network, wherein the disease prediction request carries a user identity and encrypted gene data of a user, and the encrypted gene data is obtained by encrypting original gene data of the user by using a public key of the user.
Step S22: in response to the disease prediction request, generating a private key query request according to the user identity in the disease prediction request, wherein the private key query request carries the user identity.
Step S23: signing the private key query request by using the disease prediction server's own private key, and sending the signed private key query request to a blockchain node server through the 5G communication network, so that the blockchain node server performs signature verification on the private key query request by using the public key of the disease prediction server, queries, after the signature verification passes, the user private key corresponding to the user identity from a local ledger database by using the user identity carried in the private key query request as an index, and returns the queried user private key to the disease prediction server.
Step S24: receiving the user private key returned by the blockchain node server, decrypting the encrypted gene data by using the user private key to obtain the original gene data, and performing disease prediction by using the original gene data to obtain a disease prediction result.
Step S25: returning the disease prediction result to the user terminal through the 5G communication network.
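Steps S21 to S25 can be read as a single request handler on the disease prediction server. The sketch below shows only the control flow: signing, the query to the blockchain node server, decryption and prediction are passed in as callables, and the handler name, parameter names and dictionary fields are all assumptions made for illustration.

```python
def handle_disease_prediction_request(request, sign, query_node, decrypt, predict):
    """Run steps S21 to S25 for one request and return the response for the terminal.

    sign(query) -> signature made with the server's own private key
    query_node(signed_query) -> user private key, or None if verification fails
    decrypt(encrypted_gene_data, user_private_key) -> original gene data
    predict(original_gene_data) -> disease prediction result
    """
    # S21: the request carries the user identity and the encrypted gene data.
    user_id = request["user_id"]
    encrypted_gene_data = request["encrypted_gene_data"]

    # S22: generate the private key query request carrying the user identity.
    query = {"type": "private_key_query", "user_id": user_id}

    # S23: sign the query and send it to the blockchain node server.
    signed_query = {"query": query, "signature": sign(query)}
    user_private_key = query_node(signed_query)
    if user_private_key is None:
        raise PermissionError("private key query was rejected by the node")

    # S24: decrypt the gene data and run the disease prediction.
    original_gene_data = decrypt(encrypted_gene_data, user_private_key)
    result = predict(original_gene_data)

    # S25: return the disease prediction result to the user terminal.
    return {"user_id": user_id, "disease_prediction_result": result}
```

Passing the four operations in as parameters keeps the sketch independent of any particular cipher, network client or model, and lets each step be replaced by a stub when testing the flow.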
Optionally, in some embodiments, the user private key is previously stored in the ledger database of the blockchain node server by the user terminal through a private key deposit transaction.
Optionally, in some specific embodiments, the step S24 specifically includes the following sub-steps:
Substep S24-1: determining the gene loci at which gene mutation occurs from the original gene data, and combining all of these gene loci into a gene mutation vector;
Substep S24-2: inputting the gene mutation vector into a pre-trained disease prediction model to obtain the disease prediction result output by the disease prediction model.
Optionally, in some embodiments, the disease prediction model is trained by: collecting sample gene data of a plurality of sample users with the disease; for each piece of sample gene data, determining the gene loci at which gene mutation occurs and forming these gene loci into a sample gene mutation vector; training a preset BP neural network by taking the plurality of sample gene mutation vectors as training data; and determining the trained BP neural network as the disease prediction model.
Optionally, in some specific embodiments, before sending the disease prediction result to the user terminal, the method further includes: signing the disease prediction result by using the user private key.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.