CN116522415B - System for realizing safe storage and sharing of medical big data - Google Patents

Info

Publication number
CN116522415B
Authority
CN
China
Prior art keywords
engine
data
database
storage
sharing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310440541.5A
Other languages
Chinese (zh)
Other versions
CN116522415A (en)
Inventor
黄艺海
甘晨
林煌
陈炯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Qianyun Data Technology Co ltd
Original Assignee
Hangzhou Qianyun Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Qianyun Data Technology Co ltd
Priority to CN202310440541.5A
Publication of CN116522415A
Application granted
Publication of CN116522415B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/78 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/176 Support for shared access to files; File sharing support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2219 Large Object storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Storage Device Security (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a system for the secure storage and sharing of medical big data. The system comprises a database pre-engine, a storage pre-engine, a cloud storage access engine, a key management system and a third-party data sharing engine. The database pre-engine and the storage pre-engine are connected in series on the database link and the storage link, respectively, of the medical institution's existing IT systems, and filter all access by those IT systems to the database and the storage. The cloud storage access engine periodically requests data from the database pre-engine, processes, encapsulates and compresses the data, requests an encryption key from the key management system, encrypts the data and uploads it to the cloud storage. The third-party data sharing engine submits a sharing request to the cloud storage and waits for verification; if the verification passes, it acquires the data from the cloud storage, reassembles, decrypts and decompresses it, and writes it into the third-party database and the third-party storage. The application ensures that medical big data is stored securely in cloud storage and can be shared and used in encrypted form among multiple medical institutions.

Description

System for realizing safe storage and sharing of medical big data
Technical Field
The application relates to a system for realizing safe storage and sharing of medical big data.
Background
Information technology has become an important support for the daily work of medical institutions, particularly large hospitals. Information systems of many kinds are widely used in medical institutions, and medical equipment and instruments have been digitized, so the volume of hospital data keeps growing. These medical information resources are highly valuable for disease management, disease control and medical research. However, as a traditional industry, the medical and health sector has its own complexity and particularities in IT construction. In any large hospital, tens of thousands of patients are diagnosed every day, and the patients' basic information, image information, pathological information and other specialized diagnosis and treatment information together form a huge data set. The data volume grows geometrically and puts great pressure on hospital applications such as data storage, integration and retrieval.
Besides the huge data scale, the data types and structures of the medical industry are extremely complex: unstructured data from services such as PACS imaging, B-mode ultrasound and pathological analysis coexists with the structured database data that corresponds to it. The storage structure is therefore extremely complex, which poses great challenges to traditional processing methods and technologies.
Meanwhile, in today's era of big data, governments and medical institutions are increasingly aware of the importance of breaking down the information silos of individual medical institutions. If medical data can be shared, the repeated examinations that patients undergo when seeking treatment at multiple sites can be greatly reduced, and consultations with experts in other locations can be carried out more conveniently and quickly.
In addition, big data technology has strategic significance for the specialized processing of medical and health data of all kinds. In public health, for example, big data can improve disease forecasting and early-warning capability and help prevent epidemic outbreaks.
In new drug development, big data modeling and analysis can determine the most effective input-output ratio and thus the optimal combination of resources. Personalized medicine can also improve healthcare outcomes by analyzing large data sets (e.g., genomic data) to provide personalized treatment plans, such as early detection and diagnosis before a patient develops symptoms of a disease.
However, sharing medical data raises personal privacy and data security issues, which hamper the practical application of medical big data today. Given the massive and complex nature of medical big data, solving the sharing problem is itself very difficult.
Disclosure of Invention
To address these problems, the application provides a system for the secure storage and sharing of medical big data. The system ensures that medical big data is stored securely in cloud storage, and provides authorization and automatic desensitization at the time of use so that the data can be shared and used among multiple medical institutions.
To this end, the application adopts the following technical scheme: a system for realizing secure storage and sharing of medical big data, comprising a database pre-engine, a storage pre-engine, a cloud storage access engine, a key management system and a third-party data sharing engine;
the database pre-engine and the storage pre-engine are connected in series on the database link and the storage link, respectively, of the medical institution's existing IT systems, and filter all access by the IT systems to the database and the storage;
the cloud storage access engine periodically requests incremental structured data from the database pre-engine and, at the same time, incremental unstructured data from the storage pre-engine, processes, encapsulates and compresses the data, requests an encryption key from the key management system, encrypts the data and then uploads the encrypted data to the cloud storage;
the third-party data sharing engine submits a sharing request to the cloud storage; the cloud storage access engine receives a message triggered by the request and waits for verification; if the verification passes, the third-party data sharing engine acquires the data from the cloud storage and, after reassembling, decrypting and decompressing it, writes the structured data into the third-party database and the unstructured data into the third-party storage.
Preferably, the database pre-engine and the storage pre-engine judge, according to a pre-configured policy, whether a request may be submitted to the back-end database or storage, and judge, according to the source of the request and the policy, whether the data returned by the database or storage needs to be desensitized.
Preferably, the database pre-engine comprises an engine database protocol analysis module that detects the type of the database server and the version of the database server software. The database pre-engine uses the detected type and version to perform deep analysis of the TCP/IP-based database access protocol. The engine database protocol analysis module maintains a session search tree of successfully logged-in sessions and records database-related information on each session node; when a session logs out, the corresponding session node is deleted from the tree.
Preferably, when the database pre-engine receives TCP/IP protocol data for an access, it first looks up the session node in the corresponding session tree and extracts the SQL statement from the traffic according to the characteristics of the target database's type and version. The engine then parses the syntax tree of the extracted SQL statement and classifies the SQL operation into one of four types: inserting data records, modifying data records, reading data records, and other operations. Finally, it submits the SQL operation and its classification to a database execution unit. The execution unit passes this information to a filtering policy engine, which judges according to the predefined database security policy whether the operation may be executed; if so, the execution unit submits the operation to the back-end database; otherwise, alarm information is sent to an administrator.
Preferably, the database pre-engine further comprises a database result set analysis module and a database desensitization module; the database result set analysis module parses the protocol traffic to obtain the query results returned by the database and sends them to the database desensitization module;
the database desensitization module, as directed by the filtering policy engine, desensitizes the privacy-related data in the result set using a finite-state-machine text search-and-replace method; after desensitization, a database protocol encapsulation module repackages the returned data in the protocol of the detected database type, according to the session information in the session tree, and returns it to the IT system.
Preferably, when the storage pre-engine receives a TCP connection initiated by the IT system to the storage system, the engine creates a session node for the connection, establishes a session search tree, allocates a session temporary write operation area for the session, and records the address of that area in the session node; the allocated space is released when the connection is closed and the session ends.
Preferably, when the IT system streams a file to the storage system, the storage pre-engine looks up the corresponding session node through the search tree, obtains the session's temporary write operation area, and writes the file stream into that area until the file has been written completely; once the file is complete, the filtering policy engine is triggered to filter and check the file; if the check passes, the file is written to the storage system; otherwise, the file remains in the temporary write operation area, where only the current session can read and write it.
Preferably, when the current session initiates a read operation, the storage pre-engine looks up the corresponding session node through the search tree, obtains the session's temporary write operation area, and checks whether the target file exists in that area; if it does, the file is placed in the file data read buffer; if not, the file is requested from the back-end storage system and likewise placed in the read buffer; once the file data is ready in the read buffer, the filtering policy engine is triggered to decide whether the file content needs to be desensitized, and if so, the privacy-related data is desensitized using the finite-state-machine text search-and-replace method.
Preferably, the cloud storage access engine periodically requests incremental data from the database pre-engine and the storage pre-engine. The database pre-engine and the storage pre-engine each convert the obtained incremental result set into unstructured form, yielding an unstructured file with the complete information of the result set and an unstructured file of the desensitized result set; they XOR the two files bit by bit, retain the desensitized file and the bitwise XOR result file, and compress these two files separately. The cloud storage access engine periodically acquires the compressed files, encrypts them, and uploads them to the cloud storage in fragments.
Preferably, the compressed files consist of two parts, one being the desensitized data and the other the bitwise XOR result; two encryption keys are requested in turn from the key management system, using the current timestamp and the UUID of the device as factors, and the two parts are then encrypted with the two keys respectively.
Preferably, the system further holds a private key, with which two certificates are issued; the two corresponding key factors are packaged in the two certificates respectively, and the certificates are stored in the system.
Preferably, the third-party sharing engine sends a sharing request to the cloud storage access engine by means of a cloud storage subscription; if the request passes the audit, the cloud storage access engine grants the third-party sharing engine read-only permission on the corresponding data in the cloud storage and issues to it the key-factor certificate for the corresponding desensitized data;
after downloading all the corresponding desensitized data fragments, the third-party engine verifies the certificate with the public key, requests the key from the key management system using the factors in the certificate, then decrypts and decompresses the desensitized data and stores it in its own database and storage system.
The application periodically compresses and encapsulates the structured and unstructured data inside a medical institution, protects it with dynamic encryption during transmission so that medical big data is stored securely in cloud storage, and provides authorization and automatic desensitization at the time of use so that the data can be shared and used among multiple medical institutions.
Drawings
FIG. 1 is a schematic diagram of the system structure of the present application.
FIG. 2 is a schematic diagram of an implementation of the database pre-engine of the present application.
FIG. 3 is a schematic diagram of an implementation of the storage pre-engine of the present application.
Detailed Description
The application will be further illustrated with reference to specific examples.
1. System overview
As shown in FIG. 1, the system for secure storage and sharing of medical big data comprises a database pre-engine, a storage pre-engine, a cloud storage access engine, a key management system and a third-party data sharing engine. The third-party data sharing engine runs outside the system and serves as the interface through which data is shared with third parties.
The database pre-engine and the storage pre-engine are connected in series on the database link and the storage link, respectively, of the medical institution's existing IT systems, and filter all access by the IT systems to the database and the storage.
Both pre-engines judge, according to a pre-configured policy, whether a request may be submitted to the back-end database or storage, and judge, according to the source of the request and the policy, whether the data returned by the database or storage needs to be desensitized.
The cloud storage access engine periodically requests incremental structured data from the database pre-engine and, at the same time, incremental unstructured data from the storage pre-engine, processes, encapsulates and compresses the data, requests an encryption key from the key management system, encrypts the data and then uploads the encrypted data to the cloud storage.
When the third-party data sharing engine submits a sharing request, the cloud storage access engine receives the message triggered by the request and waits for a manual audit or an automatic policy audit to decide whether the request passes. If it passes, the engine grants read-only permission on the data and issues a temporary certificate to the third-party data visitor. The third-party data sharing engine then acquires the data from the cloud storage and, after reassembling, decrypting and decompressing it, writes the structured data into the third-party database and the unstructured data into the third-party storage.
2. Database pre-engine implementation
As shown in FIG. 2, the database pre-engine is a filtering engine that must sit in line on the link between the medical institution's IT systems and the supporting database, so that all database access traffic flows through it. So as not to affect the use of the medical IT systems, the engine must not change the protocol the IT systems already use to access the database. For example, if the institution's core database is Oracle, the database pre-engine must let the IT systems keep accessing it with the Oracle protocol; likewise, if the database is MS SQL Server, MySQL, PostgreSQL (PG) or DB2, or a domestic database such as Dameng or Shentong, the pre-engine must let the IT systems keep using the original database protocol to access these core databases.
Because database access traffic is terminated at the engine and then forwarded to the database server according to policy, and the server's return traffic is likewise terminated at the engine and then forwarded to the IT system according to policy, the engine must deeply inspect and parse the whole connection life cycle: connecting, logging in, accessing and logging out.
When a database access connection from the IT system reaches the engine, the engine only accepts the connection and performs no other operation while it waits for the IT system to send a login request. The engine database protocol analysis module then detects the type of the database server (that is, which database server software it is) and forwards the login request. Detection covers not only the database server type but also the specific release version of the database software, because different versions also differ in their communication protocols.
The engine then uses the detected type and version to perform deep analysis of the TCP/IP-based database access protocol. To do so, the protocol analysis module maintains a session search tree of successfully logged-in sessions, records the database-related information on each session node, and deletes the corresponding session node from the tree on logout.
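The session bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: a Python dict keyed by a hypothetical (client IP, client port) pair stands in for the session search tree, and the names DbSession and SessionRegistry are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical connection key: (client IP, client port).
ConnKey = Tuple[str, int]


@dataclass
class DbSession:
    """Information recorded on a session node after a successful login."""
    db_type: str      # detected database server type, e.g. "oracle"
    db_version: str   # detected release version
    user: str         # login name seen in the login request


class SessionRegistry:
    """Stand-in for the session search tree (a dict, not a real tree)."""

    def __init__(self) -> None:
        self._sessions: Dict[ConnKey, DbSession] = {}

    def on_login_success(self, key: ConnKey, session: DbSession) -> None:
        # A node is added only once the login has succeeded.
        self._sessions[key] = session

    def lookup(self, key: ConnKey) -> Optional[DbSession]:
        # Used for every later packet on the connection.
        return self._sessions.get(key)

    def on_logout(self, key: ConnKey) -> None:
        # The node is removed when the session logs out.
        self._sessions.pop(key, None)
```

A balanced tree or trie keyed on the connection tuple would match the text more literally; the dict is used here only to keep the sketch short.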
During database access, the engine receives the TCP/IP protocol data, first looks up the session node in the corresponding session tree, and extracts the SQL statement from the traffic according to the characteristics of the target database's type and version. The engine then parses the syntax tree of the extracted SQL statement and classifies the SQL operation into one of four types: inserting data records, modifying data records, reading data records, and other operations. Finally, it submits the SQL operation and its classification to a database execution unit. The execution unit passes this information to a filtering policy engine, which judges according to the predefined database security policy whether the operation may be executed; if so, the execution unit submits the operation to the back-end database; otherwise, the filtering policy engine sends alarm information to an administrator.
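The classify-then-filter step can be illustrated with a short sketch. It makes several simplifying assumptions: the leading keyword is used in place of full syntax-tree parsing, the policy is a hypothetical mapping from request source to permitted operation types, and the text does not say where DELETE belongs, so it falls under "other" here.

```python
from enum import Enum
from typing import Dict, Set, Tuple


class SqlOp(Enum):
    INSERT = "insert data records"
    MODIFY = "modify data records"
    READ = "read data records"
    OTHER = "other operations"


# Assumption: leading keyword decides the class; a real engine parses the syntax tree.
_KEYWORD_TO_OP = {"INSERT": SqlOp.INSERT, "UPDATE": SqlOp.MODIFY, "SELECT": SqlOp.READ}


def classify_sql(statement: str) -> SqlOp:
    """Classify a statement into one of the four operation types."""
    stripped = statement.strip()
    first = stripped.split(None, 1)[0].upper() if stripped else ""
    return _KEYWORD_TO_OP.get(first, SqlOp.OTHER)


def decide(source: str, statement: str,
           policy: Dict[str, Set[SqlOp]]) -> Tuple[SqlOp, bool]:
    """Return the operation type and whether the (hypothetical) policy allows it.

    A False result corresponds to the branch where an alarm is sent to the
    administrator instead of forwarding the statement to the back-end database.
    """
    op = classify_sql(statement)
    return op, op in policy.get(source, set())
```

For example, with policy = {"his_app": {SqlOp.READ, SqlOp.INSERT}}, a SELECT from "his_app" is forwarded, while any statement from an unlisted source is rejected.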
The back-end core database receives the operation sent by the engine's database execution unit and executes it. The execution result is also returned to the engine as TCP/IP protocol traffic; the engine again looks up the corresponding node in the session search tree and parses the returned traffic according to the recorded database type and version.
The database result set analysis module in the engine parses the protocol traffic to obtain the query results returned by the database and sends them to the database desensitization module. As directed by the filtering policy engine, the database desensitization module uses a finite-state-machine text search-and-replace method to mask privacy-related data in the result set, such as ID card numbers, mobile phone numbers, names and addresses: every digit and character after the first character is replaced with an asterisk. After desensitization, the database protocol encapsulation module repackages the returned data in the protocol of the detected database type, according to the session information in the session tree, and returns it to the IT system. In this way, without any migration or modification of its IT systems, a medical institution can transparently desensitize data when databases are shared among IT systems, effectively preventing privacy leakage.
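The masking rule (keep the first character, replace the rest with asterisks) can be sketched with a small two-state scanner. The sketch is illustrative only: it recognizes just one pattern, runs of at least min_len digits such as phone or ID numbers, and both the function names and the min_len threshold are assumptions rather than details from the text.

```python
def mask_value(value: str) -> str:
    """Keep the first character and replace every later one with '*'."""
    return value[:1] + "*" * (len(value) - 1)


def mask_digit_runs(text: str, min_len: int = 11) -> str:
    """Scan text with a two-state machine (outside / inside a digit run).

    Digit runs of at least min_len characters are masked; shorter runs and
    all other characters pass through unchanged. Output length equals input
    length, since each masked character becomes exactly one asterisk.
    """
    out = []
    run = []  # digits collected while in the "inside a digit run" state
    for ch in text + "\0":  # sentinel forces the final run to be flushed
        if ch.isdigit():
            run.append(ch)
            continue
        if run:  # leaving a digit run: decide whether to mask it
            digits = "".join(run)
            out.append(mask_value(digits) if len(digits) >= min_len else digits)
            run = []
        out.append(ch)
    return "".join(out)[:-1]  # drop the sentinel
```

Names and addresses would need additional patterns or column-level rules supplied by the policy; those are outside this sketch.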
3. Storage pre-engine implementation
As shown in FIG. 3, the storage pre-engine is also a filtering engine and must likewise be connected in series on the link between the medical institution's IT systems and the supporting storage system, so that all read and write operations on stored data pass through it. So as not to affect the use of the medical IT systems, the engine must not change the way the IT systems access the storage; that is, it works transparently between the IT systems and the storage system and determines, according to policy, whether an operation is legal and whether the data needs to be desensitized.
The data that a medical institution's IT systems write to the storage system is typically image data in file form. An image file is a composite file containing both patient information and image information; it is saved to the storage system and later read back by the IT systems. When an IT system initiates a TCP connection to the storage system through the engine, the engine creates a session node for the connection and, like the database pre-engine, maintains a session search tree; it also allocates a session temporary write operation area for the session and records the address of that area in the session node. The allocated space is released when the connection is closed and the session ends.
When the IT system streams an image file to the storage system, the engine looks up the corresponding session node through the search tree, obtains the session's temporary write operation area, and writes the image file stream into that area until the file has been written completely. Once the image file is complete, the filtering policy engine is triggered to filter and check the file; if the check passes, the file is written to the storage system; otherwise, the image file remains in the temporary write operation area, where only the current session can read and write it. When the current session initiates a read operation, the engine first looks up the corresponding session node through the search tree and obtains the session's temporary write operation area, then checks whether the target image file exists in that area. If it does, the file is placed in the file data read buffer; if not, the engine requests the file from the back-end storage system and places it in the read buffer. Once the file data is ready in the read buffer, the filtering policy engine is triggered to decide whether the file content needs to be desensitized; if so, the finite-state-machine text search-and-replace method is used to replace with asterisks every character after the first in the privacy-related patient information in the file content.
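The write and read paths through the session temporary write operation area can be sketched as follows. This is a simplified model under stated assumptions: in-memory dicts stand in for the back-end storage system and the temporary areas, the check callback is a hypothetical placeholder for the filtering policy engine, removing a file from the temporary area once it passes the check is an assumption, and desensitization on read is omitted.

```python
from typing import Callable, Dict, Optional


class StoragePreEngine:
    """Toy model of per-session staging in front of a back-end store."""

    def __init__(self, backend: Dict[str, bytes],
                 check: Callable[[str, bytes], bool]) -> None:
        self.backend = backend   # stand-in for the back-end storage system
        self.check = check       # stand-in for the filtering policy engine
        self.temp: Dict[int, Dict[str, bytearray]] = {}  # session -> temp area

    def open_session(self, conn_id: int) -> None:
        # Allocate the session's temporary write operation area.
        self.temp[conn_id] = {}

    def write_chunk(self, conn_id: int, name: str, chunk: bytes) -> None:
        # File streams are staged in the temporary area first.
        self.temp[conn_id].setdefault(name, bytearray()).extend(chunk)

    def finish_write(self, conn_id: int, name: str) -> bool:
        # Once the file is complete, run the policy check.
        data = bytes(self.temp[conn_id][name])
        if self.check(name, data):
            self.backend[name] = data
            del self.temp[conn_id][name]  # assumption: staged copy is dropped
            return True
        return False  # file stays visible to this session only

    def read(self, conn_id: int, name: str) -> Optional[bytes]:
        # Look in the session's temporary area first, then the back end.
        area = self.temp[conn_id]
        if name in area:
            return bytes(area[name])
        return self.backend.get(name)

    def close_session(self, conn_id: int) -> None:
        # Release the temporary area when the connection ends.
        self.temp.pop(conn_id, None)
```

A rejected file is therefore never visible to other sessions and disappears when its session closes.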
4. Encrypted cloud storage and secure sharing implementation
The cloud storage access engine periodically requests incremental data from the database pre-engine and the storage pre-engine. From the database pre-engine it first requests the incremental result set with complete information, which is converted into unstructured form to obtain an unstructured file of the complete result set. It then requests the incremental result set with desensitized information, which is converted in the same way to obtain an unstructured file of the desensitized result set.
The two files are then XORed bit by bit. The desensitized unstructured file and the bitwise XOR result file are retained, and each is compressed separately.
Incremental data is requested from the storage pre-engine in a similar way: the engine requests the image file with complete information and the desensitized image file, XORs the two bit by bit, retains the desensitized image file and the bitwise XOR result file, and compresses each separately.
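The XOR split and its reversal can be sketched in a few lines. The sketch assumes that the complete file and the desensitized file have the same length (which holds when masking replaces each character with exactly one asterisk), uses zlib as a stand-in compressor because the text does not name one, and introduces the function names for illustration only.

```python
import zlib
from typing import Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_for_upload(full: bytes, masked: bytes) -> Tuple[bytes, bytes]:
    """Return (compressed desensitized file, compressed XOR result file)."""
    delta = xor_bytes(full, masked)
    return zlib.compress(masked), zlib.compress(delta)


def recover_full(masked_z: bytes, delta_z: bytes) -> bytes:
    """XOR the desensitized data with the XOR result to get the full data back."""
    return xor_bytes(zlib.decompress(masked_z), zlib.decompress(delta_z))
```

Because x ^ y ^ y == x, a holder of both parts can rebuild the complete file, while a holder of only the desensitized part sees masked data. The XOR result is zero wherever the two files agree, so it tends to compress well.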
After the cloud storage access engine periodically acquires the compressed files, it treats them as two parts: one is the desensitized data and the other is the bitwise XOR result. Using factors such as the current timestamp and the UUID of the device, it requests two encryption keys in turn from the key management system, encrypts the two compressed parts with the two keys respectively, and uploads the encrypted files to the cloud storage in fragments.
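Factor-based key requests and fragmenting can be sketched as follows. The text does not specify how the key management system turns a factor into a key, nor which cipher is used, so this sketch assumes an HMAC-SHA256 derivation from a hypothetical master secret and leaves the encryption step out; ToyKeyManagementSystem, make_factor and fragment are illustrative names.

```python
import hashlib
import hmac
import uuid
from typing import List


class ToyKeyManagementSystem:
    """Hypothetical KMS: the same factor always yields the same 32-byte key."""

    def __init__(self, master_secret: bytes) -> None:
        self._master = master_secret

    def request_key(self, factor: bytes) -> bytes:
        # Assumption: key = HMAC-SHA256(master secret, factor).
        return hmac.new(self._master, factor, hashlib.sha256).digest()


def make_factor(timestamp_ns: int, device_uuid: uuid.UUID) -> bytes:
    """Combine the current timestamp and the device UUID into one factor."""
    return f"{timestamp_ns}:{device_uuid}".encode("ascii")


def fragment(blob: bytes, size: int) -> List[bytes]:
    """Cut an (already encrypted) blob into fixed-size fragments for upload."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [blob[i:i + size] for i in range(0, len(blob), size)]
```

Under this assumption, anyone who later presents the same factor (for example, one recovered from a certificate) receives the same key, and joining the fragments in order restores the blob. The two keys differ because the two requests are made in turn and so carry different timestamps.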
And simultaneously, two certificates are issued by using the private key of the device, the two corresponding key factors are respectively packaged in the two certificates, and the certificates are stored in the device.
So far, we save the incremental data in the cloud storage periodically, even if the data fragments in the cloud storage are stolen under the unauthorized condition, the thief cannot reorganize, decrypt and decompress to obtain the data. However, if the data uploaded before needs to be re-acquired inside the medical institution, only the corresponding data fragments need to be downloaded, then the corresponding certificates are acquired in the device, the key management system is used for acquiring the key decryption decompressed file by using the factors in the certificates, and the complete data can be obtained by performing exclusive-or operation on the desensitized data and the bitwise exclusive-or result data again.
When a third party needs to obtain shared data, it must have a mutual trust relationship with the data owner, that is, it must hold the public key of the data owner, and the device must also be deployed inside the third party.
The third party data sharing engine sends a sharing request to the cloud storage access engine by way of a cloud storage subscription. If the request passes the audit, the cloud storage access engine grants the third party data sharing engine read-only permission on the corresponding data in the cloud storage and sends it the issued key factor certificate for the corresponding desensitized data.
After downloading all the corresponding desensitized data fragments, the third party data sharing engine verifies the certificate with the public key, requests the key from the key management system using the factors in the certificate, then decrypts and decompresses the desensitized data, and stores it in its own database and storage system.
It is to be understood that these examples are illustrative of the present application and are not intended to limit the scope of the present application. Furthermore, it should be understood that various changes and modifications can be made by one skilled in the art after reading the teachings of the present application, and such equivalents are intended to fall within the scope of the application as defined in the appended claims.

Claims (12)

1. A system for implementing secure storage and sharing of medical big data, comprising: a database pre-engine, a storage pre-engine, a cloud storage access engine, a key management system and a third party data sharing engine;
the database pre-engine and the storage pre-engine are connected in series on the database link and the storage link, respectively, of the existing IT system of the medical institution, and filter all access by the IT systems to the database and the storage;
the cloud storage access engine periodically requests incremental structured data from the database pre-engine, simultaneously requests incremental unstructured data from the storage pre-engine, processes, encapsulates and compresses the data, requests an encryption key from the key management system, encrypts the data and then uploads the encrypted data to the cloud storage;
and the third party data sharing engine submits a sharing request to the cloud storage; the cloud storage access engine receives a message triggered by the request and waits for verification; if the verification passes, the third party data sharing engine acquires the data from the cloud storage and, after the data are reassembled, decrypted and decompressed, writes the structured data into the third party database and the unstructured data into the third party storage.
2. The system for realizing safe storage and sharing of medical big data according to claim 1, wherein the database pre-engine and the storage pre-engine judge, according to a pre-configured policy, whether a request can be submitted to the back-end database and storage, and judge, according to the source of the request and the policy, whether the data returned by the database and the storage need to be desensitized.
3. The system for implementing safe storage and sharing of medical big data according to claim 2, wherein the database pre-engine comprises an engine database protocol analysis module capable of identifying the type of the database server and the version of the database server software; the database pre-engine uses the detected type and version to perform deep analysis of the TCP/IP-based database access protocol; and the engine database protocol analysis module maintains a search tree of sessions that have logged in successfully, records database-related information on the session nodes, and deletes the corresponding session node from the tree on logout.
4. The system for realizing safe storage and sharing of medical big data according to claim 3, wherein the database pre-engine receives the TCP/IP protocol data of an access, first looks up the session node on the corresponding session tree, and extracts SQL statements from the traffic according to the characteristics of the type and version of the target database; the engine then parses the syntax tree of each extracted SQL statement and classifies the SQL operation into one of four types: inserting data records, modifying data records, reading data records, and other operations; finally, the SQL operation and its classification are submitted to a database execution unit, which first passes the information to a filtering policy engine; the filtering policy engine judges, according to a predefined database security policy, whether the operation may be executed; if so, the execution unit submits it to the back-end database, otherwise alarm information is sent to an administrator.
5. The system for realizing safe storage and sharing of medical big data according to claim 4, wherein the database pre-engine further comprises a database result set analysis module and a database desensitization module; the database result set analysis module analyzes the protocol traffic to obtain the query results returned by the database and sends the results to the database desensitization module;
the database desensitization module, according to the filtering policy engine, applies a finite-state-machine text search-and-replace method to desensitize the privacy-related data in the result set; after desensitization, a database protocol packaging module packages the returned data according to the detected database type and protocol and the session information in the session tree, and returns the data to the IT system.
6. The system for implementing secure storage and sharing of medical big data according to any one of claims 1-5, wherein the storage pre-engine, upon receiving a TCP connection initiated by the IT system to the storage system, generates a session node for the connection, establishes a session search tree, allocates a session temporary write operation area for the session, and records the address of the temporary write operation area in the session node, the allocated areas being released when the connection is closed and the session ends.
7. The system for realizing safe storage and sharing of medical big data according to claim 6, wherein when the IT system continuously writes an integrated image file containing patient information and image information into the storage system, the storage pre-engine queries the corresponding session node information through the search tree and obtains the corresponding session temporary write operation area; the file stream is first written into the temporary write operation area until the file is completely written, whereupon the filtering policy engine is triggered to filter and check the file information; if the file passes the filtering check, the file is also written into the storage system, otherwise the file held in the temporary write operation area can only be read and written by the current session.
8. The system for implementing safe storage and sharing of medical big data according to claim 7, wherein when the current session initiates a read operation, the storage pre-engine queries the corresponding session node information through the search tree and obtains the corresponding session temporary write operation area, then checks whether the target file exists in the area; if so, it puts the file into the file data read buffer; if not, it requests the file from the back-end storage system and puts it into the file data read buffer; after the file data is ready in the read buffer, the filtering policy engine is triggered to determine whether the file content data needs to be desensitized, and if so, the finite-state-machine text search-and-replace method is applied to desensitize the privacy-related data.
9. The system for realizing safe storage and sharing of medical big data according to claim 1, wherein the cloud storage access engine periodically requests incremental data from the database pre-engine and the storage pre-engine; the database pre-engine and the storage pre-engine each convert the obtained incremental result set into unstructured form to obtain an unstructured file with the complete information of the result set and an unstructured file with the desensitized result set, XOR the two files bit by bit, and compress the desensitized file and the bitwise XOR result file separately; after the cloud storage access engine periodically acquires the compressed files, it encrypts them, then fragments the encrypted files and uploads them to the cloud storage.
10. The system for realizing safe storage and sharing of medical big data according to claim 9, wherein the compressed files consist of two parts, one being the desensitized part and the other the bitwise XOR result; for the two parts in turn, an encryption key is applied for from the key management system using the current timestamp and the UUID of the device as factors, and the two compressed files are then encrypted with the two keys respectively.
11. The system for realizing safe storage and sharing of medical big data according to claim 10, further comprising a private key, wherein two certificates are issued using the private key, the two corresponding key factors are encapsulated in the two certificates respectively, and the certificates are stored in the system.
12. The system for realizing safe storage and sharing of medical big data according to claim 11, wherein the third party data sharing engine sends a sharing request to the cloud storage access engine by way of a cloud storage subscription; if the request passes the audit, the cloud storage access engine grants the third party data sharing engine read-only permission on the corresponding data in the cloud storage and sends it the issued key factor certificate for the corresponding desensitized data;
after downloading all the corresponding desensitized data fragments, the third party data sharing engine verifies the certificate with the public key, requests the key from the key management system using the factors in the certificate, then decrypts and decompresses the desensitized data, and stores it in its own database and storage system.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310440541.5A CN116522415B (en) 2023-04-23 2023-04-23 System for realizing safe storage and sharing of medical big data

Publications (2)

Publication Number Publication Date
CN116522415A CN116522415A (en) 2023-08-01
CN116522415B true CN116522415B (en) 2023-11-07

Family

ID=87404033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310440541.5A Active CN116522415B (en) 2023-04-23 2023-04-23 System for realizing safe storage and sharing of medical big data

Country Status (1)

Country Link
CN (1) CN116522415B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117150567B (en) * 2023-10-31 2024-01-12 山东省国土空间数据和遥感技术研究院(山东省海域动态监视监测中心) Cross-regional real estate data sharing system
CN117725618B (en) * 2024-02-06 2024-05-07 贵州省邮电规划设计院有限公司 Government affair service analysis management system based on big data

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103116643A (en) * 2013-02-25 2013-05-22 江苏物联网研究发展中心 Hadoop-based intelligent medical data management method
CN103248479A (en) * 2012-02-06 2013-08-14 中兴通讯股份有限公司 Cloud storage safety system, data protection method and data sharing method
JP2019079266A (en) * 2017-10-24 2019-05-23 株式会社Nobori Medical information transfer system and medical information transfer method
CN113987443A (en) * 2021-11-02 2022-01-28 西安邮电大学 Multi-cloud and multi-chain collaborative electronic medical data security sharing method
CN114356971A (en) * 2021-12-02 2022-04-15 阿里巴巴(中国)有限公司 Data processing method, device and system
CN114896622A (en) * 2022-04-13 2022-08-12 复旦大学 Medical data security cloud storage method
CN115168358A (en) * 2022-07-18 2022-10-11 协鑫电港云科技(海南)有限公司 Database access method and device, electronic equipment and storage medium
CN115274126A (en) * 2022-08-11 2022-11-01 西南医科大学附属医院 Medical inspection data sharing system based on big data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140012833A1 (en) * 2011-09-13 2014-01-09 Hans-Christian Humprecht Protection of data privacy in an enterprise system


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
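Design and Implementation of a Cloud-Computing-Based Regional Medical Information Data Sharing Platform; Tang Weiwei (唐维维); Chinese PLA Medical School (《中国人民解放军医学院》); full text *
Design of a Large-Scale Medical Information Sharing Platform Based on Cloud Computing; Lin Ze (林则) et al.; China Medical Equipment (《中国医学装备》); full text *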

Also Published As

Publication number Publication date
CN116522415A (en) 2023-08-01

Similar Documents

Publication Publication Date Title
CN116522415B (en) System for realizing safe storage and sharing of medical big data
CN111400401B (en) Electronic medical record storage system based on block chain
Chen et al. A blockchain-based medical data sharing mechanism with attribute-based access control and privacy protection
JP5711840B1 (en) Kernel program, method and apparatus incorporating relational database
TWI388183B (en) System and method for dis-identifying sensitive information and associated records
AU2022204191B2 (en) Self-consistent structures for secure transmission and temporary storage of sensitive data
TW201814511A (en) Nuts
US20090210250A1 (en) Intermediation Server, A Method, And Network For Consulting And Referencing Medical Information
US20110289310A1 (en) Cloud computing appliance
CN111816271A (en) Block chain-based electronic medical record sharing method and system and readable storage medium
CN110010213A (en) Electronic health record storage method, system, device, equipment and readable storage medium storing program for executing
US8504590B2 (en) Methods of encapsulating information in records from two or more disparate databases
WO2018009979A1 (en) A computer implemented method for secure management of data generated in an ehr during an episode of care and a system therefor
CN115274126A (en) Medical inspection data sharing system based on big data
JP5337334B2 (en) Create and transmit secure reports of data
CN111370118A (en) Diagnosis and treatment safety analysis method and device for cross-medical institution and computer equipment
Yongjoh et al. Development of an internet-of-healthcare system using blockchain
JP2005524168A (en) Storage of confidential information
Teng et al. Mobile ultrasound with DICOM and cloud connectivity
EP4352738A1 (en) Personalized data graphs including user domain concepts
Li et al. Range query in blockchain-based data sharing model for electronic medical records
Halim et al. Decentralized Children's Immunization Record Management System for Private Healthcare in Malaysia Using IPFS and Blockchain
Koutsoukos et al. Emergency Health Protocols Supporting Health Data Exchange, Cloud Storage, and Indexing.
Pleskach et al. Mechanisms for Encrypting Big Unstructured Data: Technical and Legal Aspects
Suh et al. Web-based medical image archive system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant