CN117150565A - Medical data desensitization storage method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN117150565A (application CN202311422604.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- storage
- desensitization
- medical
- medical data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
- G06F21/6254—Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/104—Peer-to-peer [P2P] networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Abstract
The invention discloses a medical data desensitization storage method and device, an electronic device, and a storage medium, and relates to the technical field of medical data desensitization storage. The method comprises: acquiring medical data to be stored, and identifying the data content contained in the medical data to determine a security policy for the medical data, the security policy comprising a data desensitization mode and a data storage mode; performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data; and determining storage nodes for the medical data on the distributed database IPFS according to the security policy, and storing an index generated during data storage on a preset medical data blockchain. Because the specific content of the medical data is identified and a different security policy, with its own data desensitization mode and storage mode, is applied accordingly, sensitive private information is protected while data storage remains efficient.
Description
Technical Field
The invention relates to the technical field of medical data desensitization storage, in particular to a medical data desensitization storage method, a device, electronic equipment and a storage medium.
Background
New technologies such as 5G and artificial intelligence are making medical services genuinely intelligent. Telemedicine, AI-assisted diagnosis, remote first aid, remote consultation, robotic ultrasound and similar applications have greatly improved the efficiency of medical diagnosis.
However, the medical data generated as a result falls into many complex categories, is large in volume, must be retained for many years, and grows substantially every year. Traditional data centers rely mainly on magnetic storage, where problems such as high energy consumption for data operation, complex equipment operation and maintenance, and repeated data migration over the retention period are increasingly prominent. In addition, medical data usually records patients' private data, so data acquisition, transmission and sharing across multiple institutions can lead to data security incidents such as large-scale leakage of sensitive information and infringement of personal privacy.
Disclosure of Invention
The invention aims to solve the problems described in the Background and provides a medical data desensitization storage method and device, an electronic device, and a storage medium.
The aim of the invention can be achieved by the following technical solution:
According to a first aspect of an embodiment of the present invention, there is provided a medical data desensitization storage method, the method including:
acquiring medical data to be stored, and identifying data content contained in the medical data to determine a security policy of the medical data; the security policy comprises a data desensitization mode and a data storage mode;
performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and determining a storage node of the medical data on the distributed database IPFS according to the security policy, and storing an index generated during data storage on a preset medical data blockchain.
Optionally, identifying the data content contained in the medical data to determine the security policy of the medical data includes:
identifying the data content contained in the medical data, if the data content only contains basic information of a patient, determining that the data desensitization mode is text data desensitization, and the data storage mode is continuous storage; the basic information comprises at least one of basic information of a patient, outpatient records, emergency records, inpatient records, inspection records, prescription records, operation records and medical insurance data;
if the data content also contains the medical image data of the patient, determining that the data desensitization mode is mixed desensitization of text data and image data, and the data storage mode is discrete storage.
Optionally, performing data desensitization processing on the medical data according to the security policy, and obtaining the de-identified medical data includes:
if the data desensitization mode is text data desensitization, scanning the medical data to determine that the data in a preset field is sensitive data, and performing desensitization treatment on the sensitive data by using a preset desensitization method to obtain de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data disturbance desensitization method and a data shielding desensitization method;
if the data desensitization mode is mixed desensitization of text data and image data, dividing the medical data into basic information and medical image data; scanning the basic information to determine that the data in a preset field is sensitive data, and performing desensitization processing on the sensitive data by using a preset desensitization method to obtain first medical data; acquiring the metadata and the data format of the medical image data, performing desensitization processing on the metadata by using a preset desensitization method, determining the type of each data element of the medical image data according to the data format, and performing the corresponding de-identification operation according to the element type to obtain second medical data; and combining the first medical data and the second medical data as the de-identified medical data; the de-identification operation includes clearing the value of an attribute item, overwriting the value of an attribute item, and deleting an attribute item.
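The text-data branch described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the field names (`name`, `phone`, `id_number`, `age`), the demo key, and the concrete rules are all assumptions chosen for illustration. It shows three of the listed method families: masking (data shielding), a keyed hash (HMAC-SHA256) used as a simple stand-in for the encryption-based method, and a rule-based generalization.

```python
import hashlib
import hmac

# Hypothetical key and field names; the source does not specify either.
SECRET_KEY = b"demo-key-not-for-production"

def mask(value: str, keep_head: int = 1, keep_tail: int = 0) -> str:
    """Masking: keep a few leading/trailing characters and star out the rest."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)
    hidden = len(value) - keep_head - keep_tail
    tail = value[len(value) - keep_tail:] if keep_tail else ""
    return value[:keep_head] + "*" * hidden + tail

def pseudonymize(value: str) -> str:
    """Keyed hash (HMAC-SHA256), truncated to 16 hex characters for readability."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

def generalize_age(age: int) -> str:
    """Rule-based generalization: replace an exact age with a 10-year band."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

# Assumed "preset fields" and the method applied to each.
FIELD_RULES = {
    "name": lambda v: mask(v, keep_head=1),
    "phone": lambda v: mask(v, keep_head=3, keep_tail=4),
    "id_number": pseudonymize,
    "age": generalize_age,
}

def desensitize_record(record: dict) -> dict:
    """Apply the rule for each preset sensitive field; copy other fields unchanged."""
    return {k: FIELD_RULES[k](v) if k in FIELD_RULES else v for k, v in record.items()}

record = {"name": "Alice", "phone": "13812345678", "id_number": "X123",
          "age": 47, "diagnosis": "flu"}
clean = desensitize_record(record)
```

Keeping the rules in a per-field table means that changing the method applied to a field is a one-line edit, which is one plausible reading of "a preset desensitization method" applied to "a preset field".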
Optionally, determining the storage node of the medical data on the distributed database IPFS according to the security policy comprises:
calculating the storage cost of each storage node for the medical data on the distributed database IPFS [formula missing in source], wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, α is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
ordering all the storage nodes according to the order from small storage cost to large storage cost to obtain a node list;
if the data storage mode is continuous storage, selecting the first node in the node list as a storage node of the medical data on a distributed database IPFS;
and if the data storage mode is discrete storage, selecting a first node and a second node in the node list as storage nodes of the medical data on the distributed database IPFS, and respectively storing the first medical data and the second medical data.
In a second aspect of the embodiment of the present invention, there is also provided a medical data desensitization storage device, the device including:
the identification module is used for acquiring medical data to be stored, and identifying data content contained in the medical data to determine a security policy of the medical data; the security policy comprises a data desensitization mode and a data storage mode;
the desensitization module is used for carrying out data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and the storage module is used for determining storage nodes of the medical data on the distributed database IPFS according to the security policy and storing indexes generated during data storage on a preset medical data blockchain.
Optionally, the identification module includes:
the first identification sub-module is used for identifying the data content contained in the medical data, and if the data content only contains basic information of a patient, the data desensitization mode is determined to be text data desensitization, and the data storage mode is continuous storage; the basic information comprises at least one of basic information of a patient, outpatient records, emergency records, inpatient records, inspection records, prescription records, operation records and medical insurance data;
and the second identification sub-module is used for determining that the data desensitization mode is mixed desensitization of text data and image data if the data content also contains medical image data of a patient, and the data storage mode is discrete storage.
Optionally, the desensitizing module includes:
the text data desensitization module is used for scanning the medical data to determine that the data in the preset field is sensitive data if the data desensitization mode is text data desensitization, and carrying out desensitization processing on the sensitive data by using a preset desensitization method to obtain de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data disturbance desensitization method and a data shielding desensitization method;
the mixed desensitization module is used for, if the data desensitization mode is mixed desensitization of text data and image data: dividing the medical data into basic information and medical image data; scanning the basic information to determine that the data in a preset field is sensitive data, and performing desensitization processing on the sensitive data by using a preset desensitization method to obtain first medical data; acquiring the metadata and the data format of the medical image data, performing desensitization processing on the metadata by using a preset desensitization method, determining the type of each data element of the medical image data according to the data format, and performing the corresponding de-identification operation according to the element type to obtain second medical data; and combining the first medical data and the second medical data as the de-identified medical data; the de-identification operation includes clearing the value of an attribute item, overwriting the value of an attribute item, and deleting an attribute item.
Optionally, the storage module includes:
a calculation module, configured to calculate the storage cost of each storage node for the medical data on the distributed database IPFS [formula missing in source], wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, α is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
the ordering module is used for ordering the storage nodes according to the order from small storage cost to large storage cost to obtain a node list;
the first storage sub-module is used for selecting the first node in the node list as a storage node of the medical data on the distributed database IPFS if the data storage mode is continuous storage;
and the second storage sub-module is used for selecting a first node and a second node in the node list as storage nodes of the medical data on the distributed database IPFS if the data storage mode is discrete storage, and respectively storing the first medical data and the second medical data.
A third aspect of the embodiment of the invention further provides an electronic device comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
and a processor for implementing any of the above-described method steps when executing a program stored on the memory.
In a fourth aspect of the embodiment of the present invention, there is further provided a computer readable storage medium, where a computer program is stored, where the computer program when executed by a processor implements any of the above method steps.
The invention has the beneficial effects that:
An embodiment of the invention provides a medical data desensitization storage method, which comprises: acquiring medical data to be stored, and identifying the data content contained in the medical data to determine a security policy for the medical data, the security policy comprising a data desensitization mode and a data storage mode; performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data; and determining storage nodes for the medical data on the distributed database IPFS according to the security policy, and storing an index generated during data storage on a preset medical data blockchain. Because the specific content of the medical data is identified and a different security policy, with its own data desensitization mode and storage mode, is applied accordingly, sensitive private information is protected while data storage remains efficient.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a flowchart of a medical data desensitization storage method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a medical data desensitization storage device according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the invention.
An embodiment of the invention provides a medical data desensitization storage method. FIG. 1 is a flowchart of the medical data desensitization storage method according to an embodiment of the present invention. The method comprises the following steps:
S1, acquiring medical data to be stored, and identifying data content contained in the medical data to determine a security policy of the medical data; the security policy comprises a data desensitization mode and a data storage mode;
S2, performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
S3, determining storage nodes of the medical data on the distributed database IPFS according to the security policy, and storing an index generated during data storage on a preset medical data blockchain.
Because the specific content of the medical data is identified and a different security policy, with its own data desensitization mode and storage mode, is applied accordingly, sensitive private information is protected while data storage remains efficient.
In one implementation, distributed storage of the medical data may reduce costs such as the energy consumed by data operation and the cost of equipment operation and maintenance.
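The last step, storing the data off-chain and recording only an index on a blockchain, can be sketched in Python as follows. This is a toy model under stated assumptions, not the patent's implementation: `ContentStore` is an in-memory stand-in for a content-addressed store such as IPFS (real IPFS content identifiers are multihash-based CIDs, not bare SHA-256 hex strings), `IndexChain` is a hash-linked list standing in for the "preset medical data blockchain", and the index fields (`record_id`, `content_key`, `node`) are hypothetical.

```python
import hashlib
import json

class ContentStore:
    """In-memory stand-in for a content-addressed store (assumption, not real IPFS)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so identical data maps to one key.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

def _digest(body: dict) -> str:
    """Deterministic hash of a block body (keys sorted for stable serialization)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

class IndexChain:
    """Append-only, hash-linked list of index records (toy stand-in for a blockchain)."""

    GENESIS = "0" * 64

    def __init__(self):
        self.blocks = []

    def append(self, index: dict) -> dict:
        prev_hash = self.blocks[-1]["hash"] if self.blocks else self.GENESIS
        body = {"height": len(self.blocks), "prev_hash": prev_hash, "index": index}
        block = {**body, "hash": _digest(body)}
        self.blocks.append(block)
        return block

    def verify(self) -> bool:
        """Recompute every hash and check each block links to its predecessor."""
        prev_hash = self.GENESIS
        for block in self.blocks:
            body = {k: block[k] for k in ("height", "prev_hash", "index")}
            if block["prev_hash"] != prev_hash or block["hash"] != _digest(body):
                return False
            prev_hash = block["hash"]
        return True

store = ContentStore()
chain = IndexChain()
payload = b'{"name": "A****", "diagnosis": "flu"}'  # already de-identified data
key = store.put(payload)
chain.append({"record_id": "rec-001", "content_key": key, "node": "node-a"})
```

Only the small index record goes on the chain; the bulk data stays in the content store and is retrieved by the key recorded in the index, which is why later tampering with either the index or the stored bytes becomes detectable.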
In one embodiment, identifying the data content contained in the medical data to determine the security policy of the medical data includes:
identifying data content contained in medical data, if the data content only contains basic information of a patient, determining that a data desensitization mode is text data desensitization, and a data storage mode is continuous storage; the basic information includes at least one of basic information of the patient, outpatient records, emergency records, hospitalization records, inspection records, prescription records, operation records, and medical insurance data;
if the data content also contains the medical image data of the patient, determining that the data desensitization mode is mixed desensitization of the text data and the image data, and the data storage mode is discrete storage.
In one implementation, the patient's basic information contains explicit sensitive private information, typically in text form, while the medical image data contains not only explicit sensitive private information but also implicit sensitive private information, for example specific attribute items contained in the medical image data.
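The policy decision described above reduces to a two-way branch, which the following Python sketch makes explicit. The representation is an assumption: medical data is modeled as a dictionary, and a non-empty `images` entry (a hypothetical key) is taken to mean that medical image data is present.

```python
from dataclasses import dataclass
from enum import Enum

class DesensitizationMode(Enum):
    TEXT = "text"
    MIXED = "text_and_image"

class StorageMode(Enum):
    CONTINUOUS = "continuous"
    DISCRETE = "discrete"

@dataclass(frozen=True)
class SecurityPolicy:
    desensitization: DesensitizationMode
    storage: StorageMode

def determine_policy(medical_data: dict) -> SecurityPolicy:
    """Choose the policy from the data content.

    Assumption: a non-empty "images" entry signals medical image data.
    """
    if medical_data.get("images"):
        # Text plus images: mixed desensitization, discrete storage.
        return SecurityPolicy(DesensitizationMode.MIXED, StorageMode.DISCRETE)
    # Basic (text) information only: text desensitization, continuous storage.
    return SecurityPolicy(DesensitizationMode.TEXT, StorageMode.CONTINUOUS)

text_only = determine_policy({"basic_info": {"name": "Alice"}})
with_images = determine_policy({"basic_info": {"name": "Alice"}, "images": [b"\x00"]})
```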
In one embodiment, data desensitizing the medical data according to a security policy to obtain de-identified medical data comprises:
if the data desensitization mode is text data desensitization, scanning medical data to determine that the data in a preset field is sensitive data, and performing desensitization treatment on the sensitive data by using a preset desensitization method to obtain de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data disturbance desensitization method and a data shielding desensitization method;
if the data desensitization mode is mixed desensitization of text data and image data, dividing the medical data into basic information and medical image data; scanning the basic information to determine that the data in a preset field is sensitive data, and performing desensitization processing on the sensitive data by using a preset desensitization method to obtain first medical data; acquiring the metadata and the data format of the medical image data, performing desensitization processing on the metadata by using the preset desensitization method, determining the type of each data element of the medical image data according to the data format, and performing the corresponding de-identification operation according to the element type to obtain second medical data; and combining the first medical data and the second medical data as the de-identified medical data; the de-identification operation includes clearing the value of an attribute item, overwriting the value of an attribute item, and deleting an attribute item.
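The three de-identification operations named above (clearing, overwriting, and deleting an attribute item) can be sketched in Python on a plain dictionary of image header attributes. This is an illustration under assumptions: the source does not name an image format, so the attribute names below are borrowed from DICOM keywords purely as examples, and the rule table is keyed by attribute name for brevity, whereas the text selects the operation by data element type.

```python
from enum import Enum

class Action(Enum):
    CLEAR = "clear"          # keep the attribute item, empty its value
    OVERWRITE = "overwrite"  # keep the attribute item, replace its value
    DELETE = "delete"        # remove the attribute item entirely

# Hypothetical rule table: attribute name -> (action, replacement value).
ATTRIBUTE_RULES = {
    "PatientName": (Action.OVERWRITE, "ANONYMOUS"),
    "PatientBirthDate": (Action.CLEAR, None),
    "InstitutionName": (Action.DELETE, None),
}

def deidentify(attributes: dict) -> dict:
    """Return a de-identified copy of the attributes; the input is not modified."""
    result = {}
    for name, value in attributes.items():
        action, replacement = ATTRIBUTE_RULES.get(name, (None, None))
        if action is Action.DELETE:
            continue
        if action is Action.CLEAR:
            result[name] = ""
        elif action is Action.OVERWRITE:
            result[name] = replacement
        else:
            result[name] = value  # no rule: attribute passes through unchanged
    return result

header = {
    "PatientName": "Alice",
    "PatientBirthDate": "19800101",
    "InstitutionName": "General Hospital",
    "Modality": "CT",
}
clean_header = deidentify(header)
```

Clearing keeps the attribute present for software that expects it, overwriting keeps a syntactically valid value, and deleting removes the item altogether; which one applies is a per-attribute policy choice.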
In one embodiment, determining a storage node for medical data on a distributed database IPFS according to a security policy includes:
calculating the storage cost of each storage node for the medical data on the distributed database IPFS [formula missing in source], wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, α is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
ordering all the storage nodes according to the order from small storage cost to large storage cost to obtain a node list;
if the data storage mode is continuous storage, selecting the first node in the node list as a storage node of medical data on the distributed database IPFS;
if the data storage mode is discrete storage, selecting a first node and a second node in the node list as storage nodes of medical data on the distributed database IPFS, and respectively storing the first medical data and the second medical data.
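The ranking and selection steps above can be sketched in Python. Because the storage cost formula is not available in this text, the sketch takes the per-node cost P as an already computed input; the cost values and node IDs in the example are made up for illustration, and the tie-breaking rule (by node ID) is an added assumption.

```python
from typing import Dict, List, Optional

def rank_nodes(costs: Dict[str, float]) -> List[str]:
    """Sort node IDs by ascending storage cost; ties are broken by node ID (assumption)."""
    return sorted(costs, key=lambda node_id: (costs[node_id], node_id))

def select_nodes(costs: Dict[str, float], storage_mode: str) -> List[str]:
    """Pick the cheapest node for continuous storage, the two cheapest for discrete."""
    if storage_mode not in ("continuous", "discrete"):
        raise ValueError(f"unknown storage mode: {storage_mode!r}")
    needed = 1 if storage_mode == "continuous" else 2
    ranked = rank_nodes(costs)
    if len(ranked) < needed:
        raise ValueError("not enough storage nodes available")
    return ranked[:needed]

def place(costs: Dict[str, float], storage_mode: str,
          first_medical_data: bytes,
          second_medical_data: Optional[bytes] = None) -> Dict[str, bytes]:
    """Map each selected node to the data it should hold."""
    nodes = select_nodes(costs, storage_mode)
    if storage_mode == "continuous":
        return {nodes[0]: first_medical_data}
    if second_medical_data is None:
        raise ValueError("discrete storage needs both the first and second medical data")
    return {nodes[0]: first_medical_data, nodes[1]: second_medical_data}

# Made-up cost values for three hypothetical nodes.
costs = {"node-a": 4.2, "node-b": 1.5, "node-c": 2.7}
single = select_nodes(costs, "continuous")
pair = select_nodes(costs, "discrete")
layout = place(costs, "discrete", b"text-part", b"image-part")
```

In discrete mode the text-derived first medical data and the image-derived second medical data end up on different nodes, so neither node alone holds the complete record.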
An embodiment of the invention provides a medical data desensitization storage device. FIG. 2 is a block diagram of the medical data desensitization storage device according to an embodiment of the present invention. The device comprises:
the identification module is used for acquiring the medical data to be stored, and identifying the data content contained in the medical data to determine the security policy of the medical data; the security policy comprises a data desensitization mode and a data storage mode;
the desensitization module is used for carrying out data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and the storage module is used for determining storage nodes of the medical data on the distributed database IPFS according to the security policy and storing indexes generated during data storage on a preset medical data blockchain.
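The three-module structure above can be pictured as a simple composition in which each module is a replaceable callable. The Python sketch below is only a structural illustration: the class name, the callable signatures, and the stub behaviors are all hypothetical and stand in for the real identification, desensitization, and storage logic.

```python
from typing import Any, Callable, Dict

class DesensitizationStorageDevice:
    """Wires an identification, a desensitization and a storage module together."""

    def __init__(self,
                 identify: Callable[[Dict[str, Any]], str],
                 desensitize: Callable[[Dict[str, Any], str], Dict[str, Any]],
                 store: Callable[[Dict[str, Any], str], Dict[str, Any]]):
        self.identify = identify
        self.desensitize = desensitize
        self.store = store

    def process(self, medical_data: Dict[str, Any]) -> Dict[str, Any]:
        policy = self.identify(medical_data)            # identification module
        clean = self.desensitize(medical_data, policy)  # desensitization module
        return self.store(clean, policy)                # storage module

# Stub modules, for illustration only.
def identify(data: Dict[str, Any]) -> str:
    return "mixed" if data.get("images") else "text"

def desensitize(data: Dict[str, Any], policy: str) -> Dict[str, Any]:
    return {k: v for k, v in data.items() if k != "name"}

def store(data: Dict[str, Any], policy: str) -> Dict[str, Any]:
    return {"policy": policy, "stored_fields": sorted(data)}

device = DesensitizationStorageDevice(identify, desensitize, store)
receipt = device.process({"name": "Alice", "diagnosis": "flu"})
```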
In one embodiment, the identification module comprises:
the first identification sub-module is used for identifying the data content contained in the medical data, and if the data content only contains basic information of a patient, determining that the data desensitization mode is text data desensitization and the data storage mode is continuous storage; the basic information includes at least one of basic information of the patient, outpatient records, emergency records, hospitalization records, inspection records, prescription records, operation records, and medical insurance data;
and the second identification sub-module is used for determining that the data desensitization mode is mixed desensitization of the text data and the image data if the data content also contains the medical image data of the patient, and the data storage mode is discrete storage.
In one embodiment, the desensitizing module comprises:
the text data desensitization module is used for scanning the medical data to determine that the data in the preset field is sensitive data if the data desensitization mode is text data desensitization, and carrying out desensitization treatment on the sensitive data by using a preset desensitization method to obtain de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data disturbance desensitization method and a data shielding desensitization method;
the mixed desensitization module is used for, if the data desensitization mode is mixed desensitization of text data and image data: dividing the medical data into basic information and medical image data; scanning the basic information to determine that the data in a preset field is sensitive data, and performing desensitization processing on the sensitive data by using a preset desensitization method to obtain first medical data; acquiring the metadata and the data format of the medical image data, performing desensitization processing on the metadata by using the preset desensitization method, determining the type of each data element of the medical image data according to the data format, and performing the corresponding de-identification operation according to the element type to obtain second medical data; and combining the first medical data and the second medical data as the de-identified medical data; the de-identification operation includes clearing the value of an attribute item, overwriting the value of an attribute item, and deleting an attribute item.
In one embodiment, the storage module includes:
a calculation module for calculating the storage cost of each storage node for the medical data on the distributed database IPFS [formula missing in source], wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, α is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
the ordering module is used for ordering the storage nodes according to the order from small storage cost to large storage cost to obtain a node list;
the first storage sub-module is used for selecting the first node in the node list as the storage node of the medical data on the distributed database IPFS if the data storage mode is continuous storage;
and the second storage sub-module is used for selecting the first node and the second node in the node list as storage nodes of the medical data on the distributed database IPFS if the data storage mode is discrete storage, and respectively storing the first medical data and the second medical data.
An embodiment of the present invention further provides an electronic device which, as shown in FIG. 3, includes a processor 301, a communication interface 302, a memory 303, and a communication bus 304, where the processor 301, the communication interface 302, and the memory 303 communicate with each other through the communication bus 304;
a memory 303 for storing a computer program;
the processor 301 is configured to execute the program stored in the memory 303, and implement the following steps:
acquiring medical data to be stored, and identifying data content contained in the medical data to determine a security policy of the medical data; the security policy comprises a data desensitization mode and a data storage mode;
performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and determining a storage node of the medical data on the distributed database IPFS according to the security policy, and storing an index generated during data storage on a preset medical data blockchain.
The communication bus of the electronic device mentioned above may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is represented in the figure by a single bold line, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk storage device. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, such as a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present invention, there is also provided a computer readable storage medium having stored therein a computer program which when executed by a processor implements the steps of any of the medical data desensitization storage methods described above.
In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the medical data desensitization storage methods of the above embodiments.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state disk (SSD)).
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The embodiments in this specification are described in a related manner: identical and similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system, electronic device, and storage medium embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant details, refer to the corresponding parts of the description of the method embodiments.
The foregoing describes one embodiment of the present invention in detail, but the description is only a preferred embodiment of the present invention and should not be construed as limiting the scope of the invention. All equivalent changes and modifications within the scope of the present invention are intended to be covered by the present invention.
Claims (10)
1. A medical data desensitization storage method, the method comprising:
acquiring medical data to be stored, and identifying data content contained in the medical data to determine a security policy of the medical data; the security policy comprises a data desensitization mode and a data storage mode;
performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and determining a storage node of the medical data on the distributed database IPFS according to the security policy, and storing an index generated during data storage on a preset medical data blockchain.
2. The method of claim 1, wherein identifying the data content contained in the medical data to determine the security policy of the medical data comprises:
identifying the data content contained in the medical data, if the data content only contains basic information of a patient, determining that the data desensitization mode is text data desensitization, and the data storage mode is continuous storage; the basic information comprises at least one of basic information of a patient, outpatient records, emergency records, inpatient records, inspection records, prescription records, operation records and medical insurance data;
if the data content also contains the medical image data of the patient, determining that the data desensitization mode is mixed desensitization of text data and image data, and the data storage mode is discrete storage.
3. The method of claim 2, wherein performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data comprises:
if the data desensitization mode is text data desensitization, scanning the medical data to determine that the data in a preset field is sensitive data, and performing desensitization treatment on the sensitive data by using a preset desensitization method to obtain de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data disturbance desensitization method and a data shielding desensitization method;
if the data desensitization mode is mixed desensitization of text data and image data, dividing the medical data into basic information and medical image data; scanning the basic information to determine that data in a preset field is sensitive data, and performing desensitization processing on the sensitive data by using a preset desensitization method to obtain first medical data; acquiring metadata and a data format of the medical image data, performing desensitization processing on the metadata by using a preset desensitization method, determining the types of data elements of the medical image data according to the data format, and performing the corresponding de-identification operation according to the element types to obtain second medical data; and combining the first medical data and the second medical data as the de-identified medical data; the de-identification operation includes clearing the value of an attribute item, overwriting the value of an attribute item, and deleting an attribute item.
4. The method of claim 3, wherein determining the storage node of the medical data on the distributed database IPFS according to the security policy comprises:
calculating the storage cost of the storage nodes of the medical data on the distributed database IPFS: [formula not reproduced in this text] wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, alpha is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
ordering all the storage nodes according to the order from small storage cost to large storage cost to obtain a node list;
if the data storage mode is continuous storage, selecting the first node in the node list as a storage node of the medical data on a distributed database IPFS;
and if the data storage mode is discrete storage, selecting a first node and a second node in the node list as storage nodes of the medical data on the distributed database IPFS, and respectively storing the first medical data and the second medical data.
5. A medical data desensitizing storage device, the device comprising:
the identification module is used for acquiring medical data to be stored and identifying data content contained in the medical data to determine a security policy of the medical data; the security policy comprises a data desensitization mode and a data storage mode;
the desensitization module is used for carrying out data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and the storage module is used for determining storage nodes of the medical data on the distributed database IPFS according to the security policy and storing indexes generated during data storage on a preset medical data blockchain.
6. The medical data desensitizing storage device according to claim 5, wherein said identification module comprises:
the first identification sub-module is used for identifying the data content contained in the medical data, and if the data content only contains basic information of a patient, the data desensitization mode is determined to be text data desensitization, and the data storage mode is continuous storage; the basic information comprises at least one of basic information of a patient, outpatient records, emergency records, inpatient records, inspection records, prescription records, operation records and medical insurance data;
and the second identification sub-module is used for determining that the data desensitization mode is mixed desensitization of text data and image data if the data content also contains medical image data of a patient, and the data storage mode is discrete storage.
7. The medical data desensitizing storage device according to claim 6, wherein said desensitizing module comprises:
the text data desensitization module is used for scanning the medical data to determine that the data in the preset field is sensitive data if the data desensitization mode is text data desensitization, and carrying out desensitization processing on the sensitive data by using a preset desensitization method to obtain de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data disturbance desensitization method and a data shielding desensitization method;
the mixed desensitization module is used for, if the data desensitization mode is mixed desensitization of text data and image data, dividing the medical data into the basic information and the medical image data; scanning the basic information to determine that data in a preset field is sensitive data, and performing desensitization processing on the sensitive data by using a preset desensitization method to obtain first medical data; acquiring metadata and a data format of the medical image data, performing desensitization processing on the metadata by using a preset desensitization method, determining the types of data elements of the medical image data according to the data format, and performing the corresponding de-identification operation according to the element types to obtain second medical data; and combining the first medical data and the second medical data as the de-identified medical data; the de-identification operation includes clearing the value of an attribute item, overwriting the value of an attribute item, and deleting an attribute item.
8. The medical data desensitizing storage device according to claim 7, wherein the storage module comprises:
a calculation module, configured to calculate the storage cost of the storage nodes of the medical data on the distributed database IPFS: [formula not reproduced in this text] wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, alpha is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
the ordering module is used for ordering the storage nodes according to the order from small storage cost to large storage cost to obtain a node list;
the first storage sub-module is used for selecting the first node in the node list as a storage node of the medical data on the distributed database IPFS if the data storage mode is continuous storage;
and the second storage sub-module is used for selecting a first node and a second node in the node list as storage nodes of the medical data on the distributed database IPFS if the data storage mode is discrete storage, and respectively storing the first medical data and the second medical data.
9. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1-4 when executing the program stored in the memory.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of claims 1-4.
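The three de-identification operations named in claims 3 and 7 (clearing the value of an attribute item, overwriting the value of an attribute item, and deleting an attribute item) can be illustrated with a short sketch. Everything below is an assumption made for illustration: the rule table, the `Op` enumeration, the `deidentify` function, and the placeholder value are hypothetical, the attribute names are merely modeled on common DICOM keywords, and the metadata is a plain dictionary rather than a parsed image file.

```python
from enum import Enum
from typing import Any, Dict


class Op(Enum):
    CLEAR = "clear"          # keep the attribute item, empty its value
    OVERWRITE = "overwrite"  # keep the attribute item, replace its value
    DELETE = "delete"        # remove the attribute item entirely


# Hypothetical rule table: which operation applies to which attribute item.
RULES: Dict[str, Op] = {
    "PatientName": Op.OVERWRITE,
    "PatientBirthDate": Op.CLEAR,
    "InstitutionName": Op.DELETE,
}


def deidentify(metadata: Dict[str, Any], placeholder: str = "ANONYMOUS") -> Dict[str, Any]:
    """Return a copy of the metadata with the rule table applied."""
    result = dict(metadata)
    for name, op in RULES.items():
        if name not in result:
            continue
        if op is Op.CLEAR:
            result[name] = ""
        elif op is Op.OVERWRITE:
            result[name] = placeholder
        else:  # Op.DELETE
            del result[name]
    return result


header = {
    "PatientName": "Example Patient",
    "PatientBirthDate": "19700101",
    "InstitutionName": "Example Hospital",
    "Modality": "CT",
}
print(deidentify(header))
```

Attribute items that are not listed in the rule table, such as `Modality` here, pass through unchanged, and the input dictionary itself is not modified.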
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311422604.0A CN117150565B (en) | 2023-10-31 | 2023-10-31 | Medical data desensitization storage method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311422604.0A CN117150565B (en) | 2023-10-31 | 2023-10-31 | Medical data desensitization storage method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117150565A (en) | 2023-12-01 |
CN117150565B (en) | 2024-03-01 |
Family
ID=88906562
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311422604.0A Active CN117150565B (en) | 2023-10-31 | 2023-10-31 | Medical data desensitization storage method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117150565B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107239666A (en) * | 2017-06-09 | 2017-10-10 | 孟群 | A kind of method and system that medical imaging data are carried out with desensitization process |
US20190354693A1 (en) * | 2018-05-17 | 2019-11-21 | International Business Machines Corporation | Blockchain for managing access to medical data |
CN114528591A (en) * | 2022-02-16 | 2022-05-24 | 平安国际智慧城市科技股份有限公司 | Data management method, device, server and storage medium |
KR20220091926A (en) * | 2020-12-24 | 2022-07-01 | 김은정 | Apparatus and method unidentifying personal information for providing inter-hospital transfer service for patient |
CN115664694A (en) * | 2022-08-25 | 2023-01-31 | 四川澳丁医疗科技有限公司 | Secure processing and transmission method based on DICOM file, client and server |
CN116011023A (en) * | 2023-01-30 | 2023-04-25 | 医渡云(北京)技术有限公司 | Data desensitization processing method and device, terminal equipment and storage medium |
CN116304186A (en) * | 2023-02-03 | 2023-06-23 | 江苏斯普德科技有限公司 | Post-structuring processing method and post-structuring processing system for medical document |
CN116415298A (en) * | 2023-03-31 | 2023-07-11 | 中国医学科学院北京协和医院 | Medical data desensitization method and system |
CN116486981A (en) * | 2023-06-15 | 2023-07-25 | 北京中科江南信息技术股份有限公司 | Method for storing health data and method and device for reading health data |
CN116579010A (en) * | 2023-04-14 | 2023-08-11 | 聊城市人民医院 | Safety application method, equipment and storage medium for medical sensitive data |
CN116756750A (en) * | 2023-05-25 | 2023-09-15 | 广东精点数据科技股份有限公司 | Medical sensitive data acquisition desensitization method |
Also Published As
Publication number | Publication date |
---|---|
CN117150565B (en) | 2024-03-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||