CN117150565A - Medical data desensitization storage method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN117150565A
Authority
CN
China
Prior art keywords
data
storage
desensitization
medical
medical data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311422604.0A
Other languages
Chinese (zh)
Other versions
CN117150565B (en)
Inventor
李静
卢国栋
宋丙华
王峰
李滋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Wangan Security Technology Co ltd
Original Assignee
Shandong Wangan Security Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Wangan Security Technology Co ltd
Priority to CN202311422604.0A
Publication of CN117150565A
Application granted
Publication of CN117150565B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Abstract

The invention discloses a medical data desensitization storage method and device, electronic equipment and a storage medium, and relates to the technical field of medical data desensitization storage. The method comprises: acquiring medical data to be stored, and identifying the data content contained in the medical data to determine a security policy for the medical data, the security policy comprising a data desensitization mode and a data storage mode; performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data; and determining storage nodes for the medical data on the distributed database IPFS according to the security policy, and storing the index generated during data storage on a preset medical data blockchain. Because the specific content of the medical data is identified and a security policy with a matching desensitization mode and storage mode is applied, sensitive private information is protected while data storage remains efficient.

Description

Medical data desensitization storage method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of medical data desensitization storage, in particular to a medical data desensitization storage method, a device, electronic equipment and a storage medium.
Background
New technologies such as 5G and artificial intelligence are making medical services genuinely intelligent. Telemedicine, AI-assisted diagnosis, remote first aid, remote consultation, robotic ultrasound and similar applications have greatly improved the efficiency of medical diagnosis.
However, the medical data generated along the way spans many categories, is large in volume, must be retained for many years, and grows substantially every year. Traditional data centers rely mainly on magnetic storage, so problems such as high energy consumption, complex equipment operation and maintenance, and repeated data migration over the retention period become increasingly prominent. In addition, medical data usually records patients' private data, and collecting, transmitting and sharing it across multiple institutions can lead to data security incidents such as large-scale leakage of sensitive information and infringement of personal privacy.
Disclosure of Invention
The invention aims to solve the problems of the background technology and provides a medical data desensitization storage method, a device, an electronic device and a storage medium.
The aim of the invention can be achieved by the following technical scheme:
According to a first aspect of an embodiment of the present invention, there is provided a medical data desensitization storage method, the method including:
acquiring medical data to be stored, and identifying data content contained in the medical data to determine a security policy of the medical data; the security policy comprises a data desensitization mode and a data storage mode;
performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and determining a storage node of the medical data on the distributed database IPFS according to the security policy, and storing an index generated during data storage on a preset medical data blockchain.
Optionally, identifying the data content contained in the medical data to determine the security policy of the medical data includes:
identifying the data content contained in the medical data; if the data content contains only the basic information of a patient, determining that the data desensitization mode is text data desensitization and the data storage mode is continuous storage; the basic information comprises at least one of the patient's basic information, outpatient records, emergency records, inpatient records, inspection records, prescription records, operation records and medical insurance data;
if the data content also contains medical image data of the patient, determining that the data desensitization mode is mixed desensitization of text data and image data and the data storage mode is discrete storage.
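The two-branch rule above can be sketched as follows. This is a minimal illustration, not the patented implementation: the record layout (a dict whose non-empty `images` key signals medical image data) and all class and function names are assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum


class DesensitizationMode(Enum):
    TEXT = "text data desensitization"
    MIXED = "mixed desensitization of text data and image data"


class StorageMode(Enum):
    CONTINUOUS = "continuous storage"
    DISCRETE = "discrete storage"


@dataclass(frozen=True)
class SecurityPolicy:
    desensitization: DesensitizationMode
    storage: StorageMode


def determine_policy(record: dict) -> SecurityPolicy:
    """Choose a security policy from the content of one record.

    Assumption: a non-empty "images" entry means the record carries
    medical image data in addition to basic information.
    """
    if record.get("images"):
        return SecurityPolicy(DesensitizationMode.MIXED, StorageMode.DISCRETE)
    return SecurityPolicy(DesensitizationMode.TEXT, StorageMode.CONTINUOUS)


text_only = {"basic": {"name": "Zhang San"}, "outpatient_records": []}
with_image = {"basic": {"name": "Zhang San"}, "images": [b"\x00\x01"]}
print(determine_policy(text_only))
print(determine_policy(with_image))
```

Keeping the policy as a single immutable value lets the later desensitization and storage steps branch on the same object.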
Optionally, performing data desensitization processing on the medical data according to the security policy, and obtaining the de-identified medical data includes:
if the data desensitization mode is text data desensitization, scanning the medical data, treating the data in preset fields as sensitive data, and desensitizing the sensitive data with a preset desensitization method to obtain the de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data perturbation desensitization method and a data masking desensitization method;
if the data desensitization mode is mixed desensitization of text data and image data: dividing the medical data into basic information and medical image data; scanning the basic information, treating the data in preset fields as sensitive data, and desensitizing the sensitive data with a preset desensitization method to obtain first medical data; acquiring the metadata and the data format of the medical image data, and desensitizing the metadata with a preset desensitization method; determining the type of each data element of the medical image data according to the data format, and performing the corresponding de-identification operation for each element type to obtain second medical data; and combining the first medical data and the second medical data as the de-identified medical data; the de-identification operations include clearing the value of an attribute item, overwriting the value of an attribute item, and deleting an attribute item.
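As a rough illustration of desensitizing preset fields, the sketch below applies two of the listed method families to a text record. The field names, the field-to-method table and the helper names are hypothetical, since the text does not list them. A keyed hash (HMAC-SHA256) stands in for the encryption method here; it is not reversible, so real encryption would be needed wherever the original value must be recoverable.

```python
import hashlib
import hmac

# Hypothetical preset fields and the method applied to each.
SENSITIVE_FIELDS = {"name": "mask", "id_number": "mask", "phone": "pseudonymize"}


def mask(value: str, keep_head: int = 1, keep_tail: int = 0) -> str:
    """Data masking: keep a few leading/trailing characters, star the rest."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)
    tail = value[len(value) - keep_tail:] if keep_tail else ""
    hidden = len(value) - keep_head - keep_tail
    return value[:keep_head] + "*" * hidden + tail


def pseudonymize(value: str, key: bytes) -> str:
    """Keyed hash used as a stand-in for the encryption method (assumption)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def desensitize_text(record: dict, key: bytes) -> dict:
    """Return a copy of the record with the preset fields desensitized."""
    out = dict(record)
    for field, method in SENSITIVE_FIELDS.items():
        if field not in out:
            continue
        if method == "mask":
            out[field] = mask(str(out[field]))
        else:
            out[field] = pseudonymize(str(out[field]), key)
    return out


record = {"name": "Li Lei", "id_number": "370102199001011234",
          "phone": "13800000000", "diagnosis": "hypertension"}
clean = desensitize_text(record, key=b"demo-key")
print(clean["name"], clean["diagnosis"])
```

Fields outside the preset table, such as `diagnosis`, pass through unchanged, which matches the idea that only data in preset fields is treated as sensitive.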
Optionally, determining the storage node of the medical data on the distributed database IPFS according to the security policy comprises:
calculating the storage cost of each storage node for the medical data on the distributed database IPFS [formula not reproduced in this text], wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, α is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
sorting all the storage nodes in ascending order of storage cost to obtain a node list;
if the data storage mode is continuous storage, selecting the first node in the node list as the storage node of the medical data on the distributed database IPFS;
and if the data storage mode is discrete storage, selecting the first node and the second node in the node list as the storage nodes of the medical data on the distributed database IPFS, and storing the first medical data and the second medical data on them respectively.
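The ranking and selection steps can be sketched as below. The cost formula itself is not available in this text, so the sketch takes the cost function as a parameter; `placeholder_cost` is an invented stand-in used only to make the example runnable and should not be read as the patent's formula. The `Node` type and its field names are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Node:
    name: str
    bandwidth: float  # B, transmission bandwidth
    space: float      # S, storage space size
    distance: float   # D, distance from the data's server
    delay: float      # T, transmission delay


def placeholder_cost(node: Node, alpha: float = 1.0) -> float:
    # NOT the patent's formula (missing from the text); illustration only.
    return alpha * node.distance * node.delay / (node.bandwidth * node.space)


def select_nodes(nodes: List[Node], storage_mode: str,
                 cost: Callable[[Node], float] = placeholder_cost) -> List[Node]:
    """Sort nodes by ascending cost and take one (continuous) or two (discrete)."""
    ranked = sorted(nodes, key=cost)
    wanted = 1 if storage_mode == "continuous" else 2
    if len(ranked) < wanted:
        raise ValueError("not enough storage nodes for the requested mode")
    return ranked[:wanted]


nodes = [
    Node("a", bandwidth=100, space=500, distance=10, delay=5),
    Node("b", bandwidth=50, space=200, distance=20, delay=8),
    Node("c", bandwidth=200, space=1000, distance=5, delay=2),
]
print([n.name for n in select_nodes(nodes, "continuous")])
print([n.name for n in select_nodes(nodes, "discrete")])
```

Passing the cost function in keeps the selection logic independent of whichever formula is actually used.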
In a second aspect of the embodiment of the present invention, there is also provided a medical data desensitization storage device, the device including:
the identification module is used for acquiring medical data to be stored, and identifying the data content contained in the medical data to determine a security policy for the medical data; the security policy comprises a data desensitization mode and a data storage mode;
the desensitization module is used for carrying out data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and the storage module is used for determining storage nodes of the medical data on the distributed database IPFS according to the security policy and storing indexes generated during data storage on a preset medical data blockchain.
Optionally, the identification module includes:
the first identification sub-module is used for identifying the data content contained in the medical data, and if the data content only contains basic information of a patient, the data desensitization mode is determined to be text data desensitization, and the data storage mode is continuous storage; the basic information comprises at least one of basic information of a patient, outpatient records, emergency records, inpatient records, inspection records, prescription records, operation records and medical insurance data;
and the second identification sub-module is used for determining that the data desensitization mode is mixed desensitization of text data and image data if the data content also contains medical image data of a patient, and the data storage mode is discrete storage.
Optionally, the desensitizing module includes:
the text data desensitization module is used for, if the data desensitization mode is text data desensitization, scanning the medical data, treating the data in preset fields as sensitive data, and desensitizing the sensitive data with a preset desensitization method to obtain the de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data perturbation desensitization method and a data masking desensitization method;
the mixed desensitization module is used for, if the data desensitization mode is mixed desensitization of text data and image data: dividing the medical data into basic information and medical image data; scanning the basic information, treating the data in preset fields as sensitive data, and desensitizing the sensitive data with a preset desensitization method to obtain first medical data; acquiring the metadata and the data format of the medical image data, and desensitizing the metadata with a preset desensitization method; determining the type of each data element of the medical image data according to the data format, and performing the corresponding de-identification operation for each element type to obtain second medical data; and combining the first medical data and the second medical data as the de-identified medical data; the de-identification operations include clearing the value of an attribute item, overwriting the value of an attribute item, and deleting an attribute item.
Optionally, the storage module includes:
a calculation module, configured to calculate the storage cost of each storage node for the medical data on the distributed database IPFS [formula not reproduced in this text], wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, α is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
the ordering module is used for sorting the storage nodes in ascending order of storage cost to obtain a node list;
the first storage sub-module is used for selecting the first node in the node list as a storage node of the medical data on the distributed database IPFS if the data storage mode is continuous storage;
and the second storage sub-module is used for selecting a first node and a second node in the node list as storage nodes of the medical data on the distributed database IPFS if the data storage mode is discrete storage, and respectively storing the first medical data and the second medical data.
According to a third aspect of the embodiment of the invention, there is also provided an electronic device comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
the memory is used for storing a computer program;
and the processor is used for implementing any of the above method steps when executing the program stored in the memory.
In a fourth aspect of the embodiment of the present invention, there is further provided a computer readable storage medium, where a computer program is stored, where the computer program when executed by a processor implements any of the above method steps.
The invention has the beneficial effects that:
The embodiment of the invention provides a medical data desensitization storage method, which comprises: acquiring medical data to be stored, and identifying the data content contained in the medical data to determine a security policy for the medical data, the security policy comprising a data desensitization mode and a data storage mode; performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data; and determining storage nodes for the medical data on the distributed database IPFS according to the security policy, and storing the index generated during data storage on a preset medical data blockchain. Because the specific content of the medical data is identified and a security policy with a matching desensitization mode and storage mode is applied, sensitive private information is protected while data storage remains efficient.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a flow chart of a medical data desensitization storage method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a medical data desensitization storage device according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention provides a medical data desensitization storage method. Referring to fig. 1, fig. 1 is a flowchart of a medical data desensitization storage method according to an embodiment of the present invention. The method comprises the following steps:
S1, acquiring medical data to be stored, and identifying the data content contained in the medical data to determine a security policy for the medical data; the security policy comprises a data desensitization mode and a data storage mode;
S2, performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and S3, determining storage nodes for the medical data on the distributed database IPFS according to the security policy, and storing the index generated during data storage on a preset medical data blockchain.
Because the specific content of the medical data is identified and a security policy with a matching desensitization mode and storage mode is applied, sensitive private information is protected while data storage remains efficient.
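To make the last part of step S3 more concrete, the sketch below records storage indexes in an append-only, hash-linked list. It is only an in-memory stand-in for the "preset medical data blockchain": the block layout, the `IndexLedger` class and the index entry fields are assumptions made for illustration, and no consensus or networking is modeled.

```python
import hashlib
import json
from typing import List


class IndexLedger:
    """Append-only, hash-linked list standing in for the medical data blockchain."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.blocks: List[dict] = []

    @staticmethod
    def _digest(body: dict) -> str:
        encoded = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def append(self, index_entry: dict) -> dict:
        """Wrap one storage index in a block linked to the previous block."""
        prev_hash = self.blocks[-1]["hash"] if self.blocks else self.GENESIS
        body = {"height": len(self.blocks), "prev_hash": prev_hash,
                "index": index_entry}
        block = dict(body, hash=self._digest(body))
        self.blocks.append(block)
        return block

    def verify(self) -> bool:
        """Recompute every hash and check each link to the previous block."""
        prev_hash = self.GENESIS
        for block in self.blocks:
            body = {k: v for k, v in block.items() if k != "hash"}
            if block["prev_hash"] != prev_hash or block["hash"] != self._digest(body):
                return False
            prev_hash = block["hash"]
        return True


ledger = IndexLedger()
ledger.append({"content_id": "example-id-1", "node": "node-c"})
ledger.append({"content_id": "example-id-2", "node": "node-a"})
print(ledger.verify())
```

Because each block hash covers the previous hash, changing any stored index afterwards makes `verify()` fail, which is the property that motivates keeping the index on a chain rather than in an ordinary table.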
In one implementation, the distributed storage of medical data may reduce costs such as data operation energy consumption, equipment operation and maintenance costs.
In one embodiment, identifying the data content contained in the medical data to determine the security policy of the medical data includes:
identifying data content contained in medical data, if the data content only contains basic information of a patient, determining that a data desensitization mode is text data desensitization, and a data storage mode is continuous storage; the basic information includes at least one of basic information of the patient, outpatient records, emergency records, hospitalization records, inspection records, prescription records, operation records, and medical insurance data;
if the data content also contains the medical image data of the patient, determining that the data desensitization mode is mixed desensitization of the text data and the image data, and the data storage mode is discrete storage.
In one implementation, the patient's basic information contains explicit sensitive private information, usually in text form, whereas medical image data contains not only explicit sensitive private information but also implicit sensitive private information, for example in specific attribute items carried by the medical image data.
In one embodiment, data desensitizing the medical data according to a security policy to obtain de-identified medical data comprises:
if the data desensitization mode is text data desensitization, scanning the medical data, treating the data in preset fields as sensitive data, and desensitizing the sensitive data with a preset desensitization method to obtain the de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data perturbation desensitization method and a data masking desensitization method;
if the data desensitization mode is mixed desensitization of text data and image data: dividing the medical data into basic information and medical image data; scanning the basic information, treating the data in preset fields as sensitive data, and desensitizing the sensitive data with a preset desensitization method to obtain first medical data; acquiring the metadata and the data format of the medical image data, and desensitizing the metadata with the preset desensitization method; determining the type of each data element of the medical image data according to the data format, and performing the corresponding de-identification operation for each element type to obtain second medical data; and combining the first medical data and the second medical data as the de-identified medical data; the de-identification operations include clearing the value of an attribute item, overwriting the value of an attribute item, and deleting an attribute item.
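The three de-identification operations on image attribute items can be sketched as follows. The text does not name an image format or say which element type maps to which operation, so the `RULES` table is hypothetical; the attribute names in the usage example are borrowed from DICOM purely for illustration.

```python
from typing import Dict

# Hypothetical mapping from element type to de-identification operation.
RULES = {
    "person_name": "overwrite",
    "identifier": "overwrite",
    "date": "clear",
    "free_text": "delete",
}


def deidentify_attributes(attrs: Dict[str, dict],
                          placeholder: str = "ANONYMOUS") -> Dict[str, dict]:
    """Apply clear / overwrite / delete to each attribute item by element type."""
    result: Dict[str, dict] = {}
    for name, item in attrs.items():
        operation = RULES.get(item["type"])
        if operation == "delete":
            continue                                        # delete the attribute item
        if operation == "clear":
            result[name] = {**item, "value": ""}            # clear the value
        elif operation == "overwrite":
            result[name] = {**item, "value": placeholder}   # overwrite the value
        else:
            result[name] = dict(item)                       # no rule: keep as-is
    return result


image_attrs = {
    "PatientName": {"type": "person_name", "value": "Li Lei"},
    "StudyDate": {"type": "date", "value": "20230101"},
    "PatientComments": {"type": "free_text", "value": "lives at ..."},
    "Modality": {"type": "code_string", "value": "CT"},
}
second_medical_data = deidentify_attributes(image_attrs)
print(sorted(second_medical_data))
```

Driving the operation from the element type rather than from individual attribute names is what lets one rule table cover every item of that type in an image.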
In one embodiment, determining a storage node for medical data on a distributed database IPFS according to a security policy includes:
calculating the storage cost of each storage node for the medical data on the distributed database IPFS [formula not reproduced in this text], wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, α is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
sorting all the storage nodes in ascending order of storage cost to obtain a node list;
if the data storage mode is continuous storage, selecting the first node in the node list as a storage node of medical data on the distributed database IPFS;
if the data storage mode is discrete storage, selecting a first node and a second node in the node list as storage nodes of medical data on the distributed database IPFS, and respectively storing the first medical data and the second medical data.
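Once nodes have been chosen, placing the data and producing the index might look like the sketch below. It uses in-memory dictionaries in place of real IPFS nodes and a plain SHA-256 hex digest as the content identifier; real IPFS content identifiers use a different encoding, so both choices, along with every name here, are assumptions for illustration.

```python
import hashlib
from typing import Dict, List


def content_id(data: bytes) -> str:
    """SHA-256 hex digest used as a simplified content identifier (assumption)."""
    return hashlib.sha256(data).hexdigest()


def store_parts(parts: List[bytes], node_names: List[str],
                stores: Dict[str, Dict[str, bytes]]) -> List[dict]:
    """Store each part on its node and return one index entry per part."""
    if len(parts) != len(node_names):
        raise ValueError("each part needs exactly one storage node")
    index: List[dict] = []
    for part, node in zip(parts, node_names):
        cid = content_id(part)
        stores.setdefault(node, {})[cid] = part
        index.append({"content_id": cid, "node": node, "size": len(part)})
    return index


stores: Dict[str, Dict[str, bytes]] = {}
first_medical_data = b'{"name": "L*****", "diagnosis": "hypertension"}'
second_medical_data = b"<de-identified image bytes>"

# Discrete storage: first and second medical data go to two different nodes.
index = store_parts([first_medical_data, second_medical_data],
                    ["node-c", "node-a"], stores)
print([entry["node"] for entry in index])
```

With continuous storage the same function would be called with a single part and a single node, yielding a one-entry index.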
The embodiment of the invention provides a medical data desensitization storage device. Referring to FIG. 2, FIG. 2 is a block diagram of a medical data desensitization storage device according to an embodiment of the present invention. The device comprises:
the identification module is used for acquiring medical data to be stored, and identifying the data content contained in the medical data to determine a security policy for the medical data; the security policy comprises a data desensitization mode and a data storage mode;
the desensitization module is used for carrying out data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and the storage module is used for determining storage nodes of the medical data on the distributed database IPFS according to the security policy and storing indexes generated during data storage on a preset medical data blockchain.
In one embodiment, the identification module comprises:
the first identification sub-module is used for identifying the data content contained in the medical data, and if the data content only contains basic information of a patient, determining that the data desensitization mode is text data desensitization and the data storage mode is continuous storage; the basic information includes at least one of basic information of the patient, outpatient records, emergency records, hospitalization records, inspection records, prescription records, operation records, and medical insurance data;
and the second identification sub-module is used for determining that the data desensitization mode is mixed desensitization of the text data and the image data if the data content also contains the medical image data of the patient, and the data storage mode is discrete storage.
In one embodiment, the desensitizing module comprises:
the text data desensitization module is used for, if the data desensitization mode is text data desensitization, scanning the medical data, treating the data in preset fields as sensitive data, and desensitizing the sensitive data with a preset desensitization method to obtain the de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data perturbation desensitization method and a data masking desensitization method;
the mixed desensitization module is used for, if the data desensitization mode is mixed desensitization of text data and image data: dividing the medical data into basic information and medical image data; scanning the basic information, treating the data in preset fields as sensitive data, and desensitizing the sensitive data with a preset desensitization method to obtain first medical data; acquiring the metadata and the data format of the medical image data, and desensitizing the metadata with the preset desensitization method; determining the type of each data element of the medical image data according to the data format, and performing the corresponding de-identification operation for each element type to obtain second medical data; and combining the first medical data and the second medical data as the de-identified medical data; the de-identification operations include clearing the value of an attribute item, overwriting the value of an attribute item, and deleting an attribute item.
In one embodiment, the storage module includes:
a calculation module for calculating the storage cost of each storage node for the medical data on the distributed database IPFS [formula not reproduced in this text], wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, α is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
the ordering module is used for sorting the storage nodes in ascending order of storage cost to obtain a node list;
the first storage sub-module is used for selecting the first node in the node list as the storage node of the medical data on the distributed database IPFS if the data storage mode is continuous storage;
and the second storage sub-module is used for selecting the first node and the second node in the node list as the storage nodes of the medical data on the distributed database IPFS if the data storage mode is discrete storage, and storing the first medical data and the second medical data on them respectively.
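One way to picture how the three modules of the device fit together is a small class that chains them. Everything in this sketch is an assumption for illustration: the module interfaces, the dictionary-based policy, and the trivial stand-in functions that take the place of real identification, desensitization and storage logic.

```python
from typing import Callable, Dict, List

Policy = Dict[str, str]


class DesensitizationStorageDevice:
    """Chains identification, desensitization and storage modules."""

    def __init__(self,
                 identify: Callable[[dict], Policy],
                 desensitize: Callable[[dict, Policy], List[bytes]],
                 store: Callable[[List[bytes], Policy], List[dict]]) -> None:
        self.identify = identify
        self.desensitize = desensitize
        self.store = store

    def process(self, record: dict) -> List[dict]:
        policy = self.identify(record)            # identification module
        parts = self.desensitize(record, policy)  # desensitization module
        return self.store(parts, policy)          # storage module, returns index


def identify(record: dict) -> Policy:
    if record.get("images"):
        return {"desensitization": "mixed", "storage": "discrete"}
    return {"desensitization": "text", "storage": "continuous"}


def desensitize(record: dict, policy: Policy) -> List[bytes]:
    # Stand-in: replace every text value with stars.
    text = repr(sorted((k, "***") for k in record if k != "images")).encode("utf-8")
    parts = [text]
    if policy["desensitization"] == "mixed":
        parts.append(b"<de-identified image bytes>")
    return parts


def store(parts: List[bytes], policy: Policy) -> List[dict]:
    # Stand-in: pretend node-1 and node-2 are the two cheapest nodes.
    nodes = ["node-1", "node-2"][: len(parts)]
    return [{"node": n, "size": len(p)} for n, p in zip(nodes, parts)]


device = DesensitizationStorageDevice(identify, desensitize, store)
print(device.process({"name": "Li Lei"}))
print(device.process({"name": "Li Lei", "images": [b"\x00"]}))
```

Injecting the modules as callables mirrors the modular description of the device and makes each stage replaceable on its own.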
The embodiment of the present invention further provides an electronic device, as shown in fig. 3, including a processor 301, a communication interface 302, a memory 303, and a communication bus 304, where the processor 301, the communication interface 302, and the memory 303 perform communication with each other through the communication bus 304,
a memory 303 for storing a computer program;
the processor 301 is configured to execute the program stored in the memory 303, and implement the following steps:
acquiring medical data to be stored, and identifying data content contained in the medical data to determine a security policy of the medical data; the security policy comprises a data desensitization mode and a data storage mode;
performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and determining a storage node of the medical data on the distributed database IPFS according to the security policy, and storing an index generated during data storage on a preset medical data blockchain.
The communication bus of the electronic device mentioned above may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single bold line in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present invention, there is also provided a computer readable storage medium having stored therein a computer program which when executed by a processor implements the steps of any of the medical data desensitization storage methods described above.
In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the medical data desensitization storage methods of the above embodiments.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), etc.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The embodiments in this specification are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the descriptions of the system, electronic device and storage medium embodiments are relatively brief because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding parts of the method embodiment descriptions.
The foregoing describes one embodiment of the present invention in detail, but the description is only a preferred embodiment of the present invention and should not be construed as limiting the scope of the invention. All equivalent changes and modifications within the scope of the present invention are intended to be covered by the present invention.

Claims (10)

1. A method of desensitizing storage of medical data, the method comprising:
acquiring medical data to be stored, and identifying data content contained in the medical data to determine a security policy of the medical data; the security policy comprises a data desensitization mode and a data storage mode;
performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and determining a storage node of the medical data on the distributed database IPFS according to the security policy, and storing an index generated during data storage on a preset medical data blockchain.
2. The method of claim 1, wherein identifying the data content contained in the medical data to determine the security policy of the medical data comprises:
identifying the data content contained in the medical data, if the data content only contains basic information of a patient, determining that the data desensitization mode is text data desensitization, and the data storage mode is continuous storage; the basic information comprises at least one of basic information of a patient, outpatient records, emergency records, inpatient records, inspection records, prescription records, operation records and medical insurance data;
if the data content also contains the medical image data of the patient, determining that the data desensitization mode is mixed desensitization of text data and image data, and the data storage mode is discrete storage.
3. The method of claim 2, wherein performing data desensitization processing on the medical data according to the security policy to obtain de-identified medical data comprises:
if the data desensitization mode is text data desensitization, scanning the medical data to determine that the data in a preset field is sensitive data, and performing desensitization treatment on the sensitive data by using a preset desensitization method to obtain de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data disturbance desensitization method and a data shielding desensitization method;
if the data desensitization mode is mixed desensitization of text data and image data, dividing the medical data into basic information and medical image data, scanning the basic information to determine that data in a preset field is sensitive data, performing desensitization processing on the sensitive data by using a preset desensitization method to obtain first medical data, acquiring metadata and a data format of the medical image data, performing desensitization processing on the metadata by using a preset desensitization method, determining the types of data elements of the medical image data according to the data format, performing corresponding de-identification operation according to the element types to obtain second medical data, and combining the first medical data and the second medical data to serve as the de-identified medical data; the de-identification operation includes clearing the value of the attribute item, overwriting the value of the attribute item, and deleting the attribute item.
4. The method of claim 3, wherein determining the storage node of the medical data on the distributed database IPFS according to the security policy comprises:
calculating the storage cost of the storage nodes of the medical data on the distributed database IPFS: [formula not reproduced in the source text] wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, alpha is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
ordering all the storage nodes according to the order from small storage cost to large storage cost to obtain a node list;
if the data storage mode is continuous storage, selecting the first node in the node list as a storage node of the medical data on a distributed database IPFS;
and if the data storage mode is discrete storage, selecting a first node and a second node in the node list as storage nodes of the medical data on the distributed database IPFS, and respectively storing the first medical data and the second medical data.
5. A medical data desensitizing storage device, the device comprising:
the identification module is used for acquiring medical data to be stored, and identifying data content contained in the medical data to determine a security policy of the medical data; the security policy comprises a data desensitization mode and a data storage mode;
the desensitization module is used for carrying out data desensitization processing on the medical data according to the security policy to obtain de-identified medical data;
and the storage module is used for determining storage nodes of the medical data on the distributed database IPFS according to the security policy and storing indexes generated during data storage on a preset medical data blockchain.
6. The medical data desensitizing storage device according to claim 5, wherein said identification module comprises:
the first identification sub-module is used for identifying the data content contained in the medical data, and if the data content only contains basic information of a patient, the data desensitization mode is determined to be text data desensitization, and the data storage mode is continuous storage; the basic information comprises at least one of basic information of a patient, outpatient records, emergency records, inpatient records, inspection records, prescription records, operation records and medical insurance data;
and the second identification sub-module is used for determining that the data desensitization mode is mixed desensitization of text data and image data if the data content also contains medical image data of a patient, and the data storage mode is discrete storage.
7. The medical data desensitizing storage device according to claim 6, wherein said desensitizing module comprises:
the text data desensitization module is used for scanning the medical data to determine that the data in the preset field is sensitive data if the data desensitization mode is text data desensitization, and carrying out desensitization processing on the sensitive data by using a preset desensitization method to obtain de-identified medical data; the preset desensitization method comprises at least one of a rule-based desensitization method, an encryption desensitization method, a camouflage desensitization method, a data disturbance desensitization method and a data shielding desensitization method;
the mixed desensitization module is used for dividing the medical data into the basic information and the medical image data if the data desensitization mode is the mixed desensitization of the text data and the image data, scanning the basic information to determine that the data in a preset field is sensitive data, performing desensitization processing on the sensitive data by using a preset desensitization method to obtain first medical data, acquiring metadata and a data format of the medical image data, performing desensitization processing on the metadata by using a preset desensitization method, determining the types of data elements of the medical image data according to the data format, performing corresponding de-identification operation according to the element types to obtain second medical data, and combining the first medical data and the second medical data to be used as the de-identified medical data; the de-identification operation includes clearing the value of the attribute item, overwriting the value of the attribute item, and deleting the attribute item.
8. The medical data desensitizing storage device according to claim 7, wherein the storage module comprises:
a calculation module, configured to calculate a storage cost of the medical data in a storage node on the distributed database IPFS: [formula not reproduced in the source text] wherein P is the storage cost, B is the transmission bandwidth, S is the storage space size of the storage node, alpha is a preset constant, D is the distance between the server where the medical data is located and the storage node, and T is the transmission delay;
the ordering module is used for ordering the storage nodes according to the order from small storage cost to large storage cost to obtain a node list;
the first storage sub-module is used for selecting the first node in the node list as a storage node of the medical data on the distributed database IPFS if the data storage mode is continuous storage;
and the second storage sub-module is used for selecting a first node and a second node in the node list as storage nodes of the medical data on the distributed database IPFS if the data storage mode is discrete storage, and respectively storing the first medical data and the second medical data.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
the memory is used for storing a computer program;
the processor is used for implementing the method steps of any one of claims 1-4 when executing the program stored in the memory.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of claims 1-4.
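The node-selection logic of claims 4 and 8 can be sketched in Python as follows. Because the storage-cost formula is not reproduced in this text, the sketch takes the cost as a caller-supplied function; the `Node` class, its field names, and `example_cost` are hypothetical, and `example_cost` is a placeholder rather than the patent's formula.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Node:
    """Hypothetical description of an IPFS storage node."""
    name: str
    bandwidth: float  # B, transmission bandwidth
    space: float      # S, storage space size of the node
    distance: float   # D, distance from the data's server to the node
    delay: float      # T, transmission delay


def select_nodes(
    nodes: List[Node],
    cost: Callable[[Node], float],
    storage_mode: str,
) -> List[Node]:
    """Rank nodes by ascending storage cost and pick one or two of them."""
    ranked = sorted(nodes, key=cost)  # node list ordered from small to large cost
    if storage_mode == "continuous":
        return ranked[:1]             # first node holds all the data
    if storage_mode == "discrete":
        if len(ranked) < 2:
            raise ValueError("discrete storage needs at least two nodes")
        return ranked[:2]             # first node: text part, second node: image part
    raise ValueError(f"unknown storage mode: {storage_mode}")


def example_cost(node: Node) -> float:
    """Placeholder cost for demonstration only; NOT the formula in the claims."""
    return node.distance + node.delay


nodes = [
    Node("a", bandwidth=100.0, space=500.0, distance=10.0, delay=5.0),
    Node("b", bandwidth=100.0, space=500.0, distance=3.0, delay=1.0),
    Node("c", bandwidth=100.0, space=500.0, distance=7.0, delay=2.0),
]
continuous = [n.name for n in select_nodes(nodes, example_cost, "continuous")]
discrete = [n.name for n in select_nodes(nodes, example_cost, "discrete")]
```

Passing the cost in as a function keeps the ranking and selection steps independent of whichever formula over P, B, S, alpha, D and T is actually used.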
CN202311422604.0A 2023-10-31 2023-10-31 Medical data desensitization storage method and device, electronic equipment and storage medium Active CN117150565B (en)

Publications (2)

Publication Number Publication Date
CN117150565A true CN117150565A (en) 2023-12-01
CN117150565B CN117150565B (en) 2024-03-01

Family

ID=88906562

Country Status (1)

Country Link
CN (1) CN117150565B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant