CN115859372B - Medical data desensitization method and system - Google Patents

Info

Publication number
CN115859372B
Authority
CN
China
Legal status
Active
Application number
CN202310199626.9A
Other languages
Chinese (zh)
Other versions
CN115859372A (en)
Inventor
李睿
胡其桐
刘瑞华
郑名扬
唐学文
Current Assignee
Chengdu Angels Biomedical Technology Co ltd
Original Assignee
Chengdu Angels Biomedical Technology Co ltd
Priority date
2023-03-04
Filing date
2023-03-04
Publication date
2023-04-25

Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention belongs to the technical field of data processing and discloses a medical data desensitization method and system. The method comprises the following steps: classifying the acquired medical data into text data and non-text data; extracting keywords from the text data, retaining the original text of non-keywords, and taking the extracted keywords and the non-text data as the data to be desensitized; classifying the data to be desensitized into personal identity information, personal medical information, date information, address information and other information; and desensitizing the classified information. The system comprises a data classification module, a sensitive word extraction module, a field classification module and a data desensitization module. The method and system can desensitize medical data fully automatically, the user only needing to input the medical fields contained in the medical data, and they can handle medical data that mixes text and non-text fields.

Description

Medical data desensitization method and system
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a medical data desensitizing method and system.
Background
Medical data contains a large amount of information tied to the personal characteristics of patients, such as a patient's name, contact phone number, birthplace, movement history and illness description. This information needs privacy protection because the patient can be harmed once it is leaked. Existing text desensitization algorithms are based on pattern matching: keywords in the text are matched statically and then removed. This leads to three problems:
1. Private information with weak regularity, such as names, cannot be matched accurately. For example, once the surname character "Zhang" (张) appears in the text, that character and the two characters after it are deleted as a person's name. However, if the text reads "varicose veins are obvious" (静脉曲张现象明显, where the character 张 occurs inside the word for varicose veins), the method wrongly treats part of the symptom description as a person's name;
2. Sensitive data cannot be judged dynamically from context. For example, the phrase "People's Hospital of Nanbu County, Sichuan Province" may appear in a patient's illness description, and existing methods mask "Nanbu County" as sensitive information, leaving only "People's Hospital of Sichuan Province". However, the hospital name does not reveal the patient's birthplace and needs no desensitization; in addition, the masked text "People's Hospital of Sichuan Province" no longer identifies a specific hospital and can be confused with the many people's hospitals in Sichuan Province;
3. Static matching rules require the person performing desensitization to enumerate in advance every possible format of the sensitive information, so items are easily missed in text with diverse forms. For example, the visit date in a patient's illness description may be written as "ju-6 of 2023", which is none of the standard formats "2023, 5, 6", "2023/5/6" or "2023/05/06" and is therefore difficult to match statically.
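The first failure mode can be reproduced in a few lines. The sketch below assumes a hypothetical static rule that treats the surname character 张 (Zhang) plus the next two characters as a person's name; `NAME_RULE` and `mask_names` are illustrative names, not part of any existing system described here.

```python
import re

# Hypothetical static rule: the surname character followed by any two
# characters is assumed to be a person's name, regardless of context.
NAME_RULE = re.compile(r"张.{2}")


def mask_names(text: str) -> str:
    """Replace every rule match with a fixed mask."""
    return NAME_RULE.sub("***", text)


# "Varicose veins are obvious": 张 here is part of 静脉曲张 (varicose veins).
clinical_note = "静脉曲张现象明显"
masked = mask_names(clinical_note)
print(masked)
```

The rule removes part of the symptom description even though no name is present, which is the kind of context-blind error that motivates a context-aware extractor.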
Disclosure of Invention
The present invention aims to solve the above technical problems at least to some extent. To this end, it provides a medical data desensitization method and system.
The technical scheme adopted by the invention is as follows:
A medical data desensitization method, comprising the following steps:
S1, classifying the acquired medical data into text data and non-text data;
S2, extracting keywords from the text data, retaining the original text of non-keywords, and taking the extracted keywords and the non-text data as the data to be desensitized;
S3, classifying the data to be desensitized into personal identity information, personal medical information, date information, address information and other information;
S4, desensitizing the classified information: the personal identity information is encrypted, the personal medical information and the date information are blurred, the address information is masked to obtain a text description, and the other information is kept as the original text.
Preferably, in step S2, a Pointer Network model improves the Attention mechanism of the Transformer-based BERT model to obtain a BERT-Pointer Network model; the BERT-Pointer Network model converts text information into word vectors based on context information and extracts the keywords in the text data.
Preferably, step S3 includes: the BERT model converts text information into word vectors based on context information; the PCA model carries out principal component decomposition on the output result of the BERT model, combines similar medical fields, and deletes irrelevant medical fields; and clustering the output result of the PCA model by using a clustering algorithm.
Preferably, the cosine distance between the new data and the four types of data including personal identity information, personal medical information, date information and address information is judged through a clustering algorithm, and if the distance between the new data and one type of data is nearest and is lower than a preset threshold value, the new data is distributed to the type of data; and if the distances between the new data and the four types of data are larger than the preset threshold value, marking the new data as other information.
Preferably, in step S1, classification of text data and non-text data is performed according to each field name of medical data.
A medical data desensitization system, comprising:
the data classification module is used for classifying the acquired medical data into text data and non-text data;
the sensitive word extraction module is used for extracting keywords in the text data, sending the extracted keywords into the field classification module and retaining the original text of the non-keywords;
the field classification module is used for classifying the non-text data and the keywords into personal identity information, personal medical information, date information, address information and other information;
the data desensitization module is used for carrying out desensitization treatment on the information classified by the field classification module: the personal identity information is encrypted, the personal medical information and the date information are subjected to blurring processing, the address information is subjected to mask covering processing to obtain text description, and other information is subjected to original text retaining processing.
Preferably, the sensitive word extraction module includes:
the Pointer Network model, used for improving the Attention mechanism of the Transformer-based BERT model to obtain a BERT-Pointer Network model;
the BERT-Pointer Network model, used for converting text information into word vectors based on context information and extracting keywords.
Preferably, the field classification module includes:
the BERT model is used for converting text information into word vectors based on the context information;
the PCA model is used for carrying out principal component decomposition on the output result of the BERT model, merging similar medical fields and deleting irrelevant medical fields;
and the clustering model is used for clustering the output result of the PCA model.
Preferably, the medical data desensitization system further comprises: and the output module is used for outputting the desensitized information to the original position of the medical data.
The beneficial effects of the invention are as follows:
1. The medical data desensitization method and system provided by the invention can desensitize medical data fully automatically; the user only needs to input the medical fields contained in the medical data. Desensitization can be performed on mixed medical data (including both text and non-text data).
2. The BERT-Pointer Network model adopted by the invention extracts the sensitive keywords of medical text data for desensitization. The model is an optimization of the BERT model and extracts sensitive keywords by making better use of the surrounding medical context. Compared with the traditional pattern recognition algorithm, the recognition accuracy is improved by 81% and the recognition speed is improved 13-fold; compared with the BERT model, the recognition accuracy is improved by 22%.
Drawings
FIG. 1 is a schematic block diagram of a medical data desensitization system of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the invention.
It should also be appreciated that in some embodiments the functions/acts may occur in an order different from that shown in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or may sometimes be executed in the reverse order, depending on the functionality/acts involved.
As shown in fig. 1, a method for desensitizing medical data according to the present embodiment includes the steps of:
S1, classifying the acquired medical data into text data and non-text data according to the name of each field. For the medical data shown in Table 1, the fields named "name", "gender", "birth date", "identification card number", "body temperature" and "blood pressure" are non-text data, and the field named "illness description" is text data.
Table 1 Raw medical data record
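Step S1 can be sketched as a lookup on field names. The following is a minimal illustration under stated assumptions: the set `TEXT_FIELD_NAMES` and the helper `split_by_field_name` are hypothetical, the field names are the ones listed for Table 1, and the values are placeholders because the table contents are not reproduced in this text.

```python
from typing import Dict, Tuple

# Assumption: the set of field names that hold free text. Only
# "illness description" is named as a text field in the embodiment.
TEXT_FIELD_NAMES = {"illness description"}


def split_by_field_name(record: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (text_fields, non_text_fields), decided purely by field name."""
    text_fields: Dict[str, str] = {}
    non_text_fields: Dict[str, str] = {}
    for name, value in record.items():
        if name.strip().lower() in TEXT_FIELD_NAMES:
            text_fields[name] = value
        else:
            non_text_fields[name] = value
    return text_fields, non_text_fields


# Field names follow the embodiment; the values are placeholders.
record = {
    "name": "<name>",
    "gender": "<gender>",
    "birth date": "<birth date>",
    "identification card number": "<id number>",
    "body temperature": "<temperature>",
    "blood pressure": "<blood pressure>",
    "illness description": "<free-text description>",
}
text_fields, non_text_fields = split_by_field_name(record)
```

Only the text fields would then go through keyword extraction in S2, while the non-text fields go straight to field classification.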
S2, using the BERT-Pointer Network model to convert text information into word vectors based on context information and extract the keywords in the text data, retaining the original text of non-keywords, and taking the extracted keywords and the non-text data as the data to be desensitized. For the record in Table 1, the extracted keywords are "Zhang Jiang", "March 14, 2018" and "March 21".
In particular, some medical text can be segmented in more than one way. For example, "People's Hospital of Nanbu County, Sichuan Province" can be split into "Nanbu County, Sichuan Province" and "People's Hospital" and encoded as two parts, or it can be encoded directly as a whole. To better process medical text with complex structure, the BERT model is optimized by extending it with a Pointer Network model. The BERT model is mainly based on the Attention mechanism. The Pointer Network model improves the Attention mechanism of the Transformer framework to obtain the BERT-Pointer Network model, which overcomes the limitation that a traditional seq2seq framework cannot handle an output sequence that changes with the length of the input sequence. The new Attention mechanism therefore encodes with better use of context and can handle overlapping labels.
The conventional Attention architecture is as follows:
$$u^i_j = v^T \tanh(W_1 e_j + W_2 d_i), \quad j \in (1, \dots, n)$$
$$a^i_j = \operatorname{softmax}(u^i_j), \quad j \in (1, \dots, n)$$
$$d'_i = \sum_{j=1}^{n} a^i_j e_j$$
where $e_j$ and $d_i$ are state quantities, $v$, $W_1$ and $W_2$ are learnable parameters of the model, $a^i_j$ is the weight, and $j$ indexes the $n$ positions of the input sequence. The Pointer Network model simplifies this: it discards the third-layer weighting and takes the softmax result directly as a pointer to a specific element of the input sequence.
The improved Attention mechanism formula is as follows:
$$u^i_j = v^T \tanh(W_1 e_j + W_2 d_i), \quad j \in (1, \dots, n)$$
$$p(C_i \mid C_1, \dots, C_{i-1}) = \operatorname{softmax}(u^i)$$
The output sequence of the Pointer Network model is drawn from the input sequence; $i$ ranges over the length of the output sequence, whose maximum is preset; the vector $u^i_j$ represents the Attention mask of the $j$-th input vector; $T$ denotes matrix transposition; $e_j$ and $d_i$ are state quantities; $v$, $W_1$ and $W_2$ are learnable parameters; $C_1$, $C_{i-1}$ and $C_i$ are random variables each representing an item of the input sequence, and $p$ is a hyperparameter representing the joint probability distribution; $p(C_i \mid C_1, \dots, C_{i-1})$ is the conditional probability that the $i$-th item occurs given the preceding $i-1$ items.
The BERT-Pointer Network model can therefore encode at two levels: "People's Hospital of Nanbu County, Sichuan Province" as a whole, and "Nanbu County, Sichuan Province" and "People's Hospital" separately, which resolves the segmentation ambiguity of medical text data.
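The pointer step can be illustrated numerically. This is a minimal NumPy sketch, assuming the additive scoring form of the original Pointer Network formulation, u_j = v^T tanh(W_1 e_j + W_2 d_i), followed by a softmax over input positions; the function name `pointer_distribution`, the toy dimensions and the random parameters are all assumptions for illustration and say nothing about how the patented model is actually implemented or trained.

```python
import numpy as np


def pointer_distribution(E, d_i, W1, W2, v):
    """Softmax over input positions for one decoding step.

    E:      (n, h) encoder states e_1..e_n
    d_i:    (h,)   decoder state at output step i
    W1, W2: (k, h) learnable matrices
    v:      (k,)   learnable vector
    """
    u = np.tanh(E @ W1.T + W2 @ d_i) @ v  # u[j] = v^T tanh(W1 e_j + W2 d_i)
    u = u - u.max()                        # shift for numerical stability
    weights = np.exp(u)
    return weights / weights.sum()


rng = np.random.default_rng(0)
n, h, k = 5, 4, 3  # toy sizes (assumed)
E = rng.normal(size=(n, h))
d_i = rng.normal(size=h)
W1 = rng.normal(size=(k, h))
W2 = rng.normal(size=(k, h))
v = rng.normal(size=k)

p = pointer_distribution(E, d_i, W1, W2, v)
pointed_index = int(np.argmax(p))  # input position the "pointer" selects
```

Because the output is a distribution over input positions rather than over a fixed vocabulary, its size follows the input length, which is the property the text relies on for span-like keyword extraction.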
S3, classifying the data to be desensitized into personal identity information (such as name and ID card number), personal medical information (such as age), date information, address information and other information.
Specifically, the BERT model converts text information into word vectors based on context information, after a certain initialization process. Because the BERT model is mainly based on the Attention mechanism, it can be run in parallel and has access to global information. The Attention function is defined as follows:
$$\operatorname{Attention}(Q, K, V) = \operatorname{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V$$
where Q (Query) represents the input information, that is, the information present in the input text; K (Key) represents the content information, which is semantic information, so that Attention(Q, K) expresses the degree of matching between Query and Key; V (Value) represents the information itself and is weighted by the degree of matching; and $d_k$ is the dimension of the Key vectors. The BERT model also takes position information into account and fully considers the influence of context on the result; its output contains a probability distribution over text information of the same type.
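For reference, the standard scaled dot-product form of the Attention function can be sketched in NumPy as follows. This assumes the usual Transformer definition softmax(QK^T / sqrt(d_k)) V; the function name and the toy shapes are illustrative, and real BERT layers add multiple heads, learned projections and masking that are omitted here.

```python
import numpy as np


def scaled_dot_product_attention(Q, K, V):
    """Return (output, weights) for single-head attention.

    Q: (m, d_k) queries, K: (n, d_k) keys, V: (n, d_v) values.
    """
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # Query/Key matching degree
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights                           # values weighted by the match


rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(5, 4))
V = rng.normal(size=(5, 2))
output, attn_weights = scaled_dot_product_attention(Q, K, V)
```

Each row of `attn_weights` sums to one, so every output row is a convex combination of the value vectors.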
The PCA model performs principal component decomposition on the output of the BERT model, merging similar medical fields and deleting irrelevant ones. For example, suppose the output of the BERT model includes age, date of birth, phone number, visit time and so on. After PCA processing, "phone number" is deleted as noise because it is irrelevant to the medical service, and "age" and "date of birth" are automatically merged as similar medical fields.
The principal component decomposition of the PCA model seeks a set of orthonormal basis vectors such that, after the data points are projected onto the subspace spanned by this basis, the spread between the data points is maximized:
$$\max_{W} \operatorname{tr}\left(W^{T} X X^{T} W\right) \quad \text{s.t.} \quad W^{T} W = I$$
where $X = (x_1, \dots, x_n)$ is the centered data. Applying the Lagrange multiplier method, the maximum of the dual problem satisfies:
$$X X^{T} w_i = \lambda_i w_i$$
where $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d$ are the sorted eigenvalues of the covariance matrix. The projection matrix is formed from the eigenvectors corresponding to the first k eigenvalues and is used to extract the principal components of the word vectors.
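The eigen-decomposition route described above can be sketched with NumPy. This is a generic PCA illustration, not the patent's implementation: the function name `pca_project`, the random stand-in for word vectors and the choice k = 2 are assumptions.

```python
import numpy as np


def pca_project(X, k):
    """Project rows of X onto the top-k principal components.

    X: (n_samples, n_features). Returns (Z, W, eigenvalues) where
    W holds the k leading eigenvectors of the covariance matrix as columns.
    """
    Xc = X - X.mean(axis=0)                  # center the data
    cov = Xc.T @ Xc / (X.shape[0] - 1)       # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    W = eigvecs[:, order]
    return Xc @ W, W, eigvals[order]


rng = np.random.default_rng(0)
word_vectors = rng.normal(size=(50, 8))      # stand-in for BERT output vectors
Z, W, lam = pca_project(word_vectors, k=2)
```

The variance of each projected coordinate equals the corresponding eigenvalue, which is why keeping the first k eigenvectors retains the directions with the largest spread.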
The output of the PCA model is clustered with the single-pass online text clustering algorithm. First, training data is provided for each of the four categories of medical data (personal identity information, personal medical information, date information and address information); after the training data is processed by the BERT-PCA model, a space-vector representation of each item and the category it belongs to are obtained. The single-pass algorithm then computes the cosine distance between new data and the four categories. If the new data is nearest to one category and that distance is below a preset threshold, the new data is assigned to that category; if the distances to all four categories are greater than the preset threshold, the new data is marked as other information.
Specifically, for the four categories (personal identity information, personal medical information, date information and address information), corresponding medical field training data is provided, such as {name}, {age}, {date of visit} and {pharmacy address}.
The training data is fed into the BERT-PCA model, which vectorizes the text and normalizes the principal components, yielding several reference vectors for each category, such as {1 0 1 0 1 0 1 0}, {0 1 0 1 0 1 0 1}, {0 0 0 0 1 1 1 1} and {1 1 1 1 0 0 0 0}. These four sets of vectors are the reference vectors of the four categories.
When a new medical field is added, such as "date of surgery", it is first fed into the BERT-PCA model and vectorized, for example as {1 1 1 0 1 0 0 0}. The new vector is then compared with the sets of reference vectors obtained above by computing cosine distances. It is found to be very close to the date reference vector {1 1 1 1 0 0 0 0}, so "date of surgery" is marked as a medical field of the "date information" category.
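The threshold rule can be sketched as a nearest-reference lookup. Everything concrete below is an assumption made for illustration: the reference vectors are toy values chosen for this example (they are not the vectors quoted in the text), the threshold 0.3 is arbitrary, and `assign_category` is a hypothetical helper rather than a full single-pass clustering implementation.

```python
import numpy as np


def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero vectors."""
    return 1.0 - float(a @ b) / float(np.linalg.norm(a) * np.linalg.norm(b))


def assign_category(vec, references, threshold):
    """Assign vec to the nearest category, or "other" if nothing is close enough."""
    best_label, best_dist = "other", float("inf")
    for label, refs in references.items():
        dist = min(cosine_distance(vec, r) for r in refs)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist < threshold else "other"


# Toy reference vectors (assumed), one list per category.
references = {
    "personal identity": [np.array([1.0, 0.0, 0.0, 0.0])],
    "personal medical": [np.array([0.0, 1.0, 0.0, 0.0])],
    "date": [np.array([0.0, 0.0, 1.0, 0.0])],
    "address": [np.array([0.0, 0.0, 0.0, 1.0])],
}
THRESHOLD = 0.3  # assumed value

date_like = np.array([0.1, 0.0, 0.9, 0.1])  # e.g. a vector for "date of surgery"
ambiguous = np.array([1.0, 1.0, 1.0, 1.0])  # equally far from every category

date_label = assign_category(date_like, references, THRESHOLD)
ambiguous_label = assign_category(ambiguous, references, THRESHOLD)
```

With these toy values the first vector lands in the date category and the second is marked as other information, since its distance to every reference (0.5) exceeds the threshold.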
S4, desensitizing the classified information. Personal identity information is encrypted with an encryption algorithm to prevent leakage of personal private information. Personal medical information is blurred with a purpose-designed randomization algorithm, which protects personal privacy while keeping the data usable for intelligent medical services. For example, for a patient's age, the system adds random noise of plus or minus 5% of the age to the real age, which hides the real age while keeping the processed value close to the real one;
the date information includes the date of a patient's visit, the date of an operation, the date of a CT scan and the like. Dates are blurred according to the applicable legal and regulatory requirements. For example, only the year and month of the original data are retained and the day is randomized within that month, so a visit date of March 14, 2018 may be blurred to March 11, 2018;
the address information includes the patient's home address, the address of the pharmacy where medicine was bought and the like. Addresses are masked according to the applicable legal and regulatory requirements. For example, only the province-level and city-level information of the original data is retained, and information below the city level (county level, district level and so on) is masked. Thus, if the patient buys medicine at the "large-heart pharmacy" located at "Hebei Jizhuang Yuhua Yu Hua Ouyu Xiang Jielian 9 lane 29", the pharmacy address is masked to "Hebei Jizhuang".
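The four per-category operations of S4 can be sketched with the Python standard library. Several choices here are assumptions: the text does not name an encryption algorithm, so keyed hashing (HMAC-SHA256) is used as a stand-in that pseudonymizes rather than reversibly encrypts; addresses are assumed to arrive already split into administrative levels; and all function names and sample values are hypothetical.

```python
import calendar
import hashlib
import hmac
import random
from datetime import date
from typing import Sequence


def pseudonymize_identity(value: str, key: bytes) -> str:
    """Stand-in for encryption: keyed one-way hash of an identity value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def blur_age(age: int, rng: random.Random, ratio: float = 0.05) -> int:
    """Add uniform noise within plus or minus `ratio` of the age."""
    return round(age + rng.uniform(-ratio * age, ratio * age))


def blur_date(d: date, rng: random.Random) -> date:
    """Keep year and month, randomize the day within that month."""
    last_day = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=rng.randint(1, last_day))


def mask_address(levels: Sequence[str], keep_levels: int = 2) -> str:
    """Keep province and city level, drop everything below."""
    return " ".join(levels[:keep_levels])


rng = random.Random(0)
key = b"demo-key-not-for-production"  # assumed key, for illustration only

token = pseudonymize_identity("Zhang Jiang", key)
noisy_age = blur_age(60, rng)
blurred_visit = blur_date(date(2018, 3, 14), rng)
masked = mask_address(["Province A", "City B", "District C", "Street D No. 1"])
```

A production system that must recover identities later would need a real reversible cipher with managed keys in place of the HMAC stand-in.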
Other information is kept as the original text. After desensitization, the medical data in Table 1 becomes the data shown in Table 2.
Table 2 Medical data record after desensitization
The final processed medical data is written into a desensitized medical health database for access by developers of intelligent medical applications; the database contains no private information of patients or doctors.
The embodiment also provides a medical data desensitizing system adopting the medical data desensitizing method, which comprises an acquisition module, a data classification module, a sensitive word extraction module, a field classification module, a data desensitizing module and an output module, wherein the acquisition module is used for acquiring medical data, and the data classification module is used for classifying text data and non-text data of the acquired medical data; the sensitive word extraction module is used for extracting keywords in the text data, sending the extracted keywords into the field classification module, and retaining the original text of the non-keywords; the field classification module is used for classifying the non-text data and the keywords into personal identity information, personal medical information, date information, address information and other information; the data desensitization module is used for carrying out desensitization treatment on the information classified by the field classification module: the personal identity information is encrypted, the personal medical information and the date information are subjected to blurring processing, the address information is subjected to mask covering processing to obtain text description, and other information is subjected to original text retaining processing. The output module is used for outputting the desensitized information to the original position of the medical data.
The sensitive word extraction module comprises a Pointer Network model and a BERT-Pointer Network model. The Pointer Network model improves the Attention mechanism of the Transformer-based BERT model to obtain the BERT-Pointer Network model; the BERT-Pointer Network model is used to convert text information into word vectors based on context information and to extract keywords.
The field classification module comprises a BERT model, a PCA model and a clustering model, wherein the BERT model is used for converting text information into word vectors based on context information; the PCA model is used for carrying out principal component decomposition on the output result of the BERT model, merging similar medical fields and deleting irrelevant medical fields; the clustering model is used for clustering the output result of the PCA model.
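How the modules fit together can be sketched as a single function that takes each module as a callable. This is only a structural illustration under assumptions: the real extraction and classification modules are neural models, replaced here by trivial stubs; classifying non-text data by field name and keywords by their own text is an assumed reading; and the sample record values are invented placeholders.

```python
from typing import Callable, Dict, List


def desensitize_record(
    record: Dict[str, str],
    is_text_field: Callable[[str], bool],
    extract_keywords: Callable[[str], List[str]],
    classify: Callable[[str], str],
    handlers: Dict[str, Callable[[str], str]],
) -> Dict[str, str]:
    """Run classification, extraction and desensitization, writing each
    result back to the field it came from."""
    result: Dict[str, str] = {}
    for field, value in record.items():
        if is_text_field(field):
            # Only extracted keywords are rewritten; other text is kept.
            for keyword in extract_keywords(value):
                value = value.replace(keyword, handlers[classify(keyword)](keyword))
            result[field] = value
        else:
            result[field] = handlers[classify(field)](value)
    return result


# Stubs standing in for the real modules (all assumed).
categories = {"name": "personal identity", "Zhang Jiang": "personal identity"}
handlers = {
    "personal identity": lambda v: "[ENCRYPTED]",
    "other": lambda v: v,
}
record = {
    "name": "Zhang Jiang",
    "body temperature": "36.8",
    "illness description": "Zhang Jiang reported a cough.",
}
desensitized = desensitize_record(
    record,
    is_text_field=lambda f: f == "illness description",
    extract_keywords=lambda text: [k for k in ["Zhang Jiang"] if k in text],
    classify=lambda item: categories.get(item, "other"),
    handlers=handlers,
)
```

Passing the modules in as callables keeps the flow independent of any particular model, so a stub can be swapped for a trained extractor or classifier without changing the surrounding logic.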
The invention is not limited to the above alternative embodiments. Anyone may derive products in various other forms in light of the present invention; however, regardless of any change in shape or structure, every technical solution that falls within the scope defined by the claims of the present invention falls within the scope of protection of the present invention.

Claims (6)

1. A method of desensitizing medical data comprising the steps of:
S1, classifying acquired medical data into text data and non-text data;
S2, extracting keywords in the text data, retaining the original text of non-keywords, and taking the extracted keywords and the non-text data as data to be desensitized;
S3, classifying the data to be desensitized into personal identity information, personal medical information, date information, address information and other information;
S4, desensitizing the classified information: encrypting personal identity information, blurring personal medical information and date information, masking address information to obtain text description, and preserving other information;
in step S1, classifying text data and non-text data according to the names of the fields of the medical data;
in step S2, the Pointer Network model improves the Attention mechanism of the Transformer-based BERT model to obtain a BERT-Pointer Network model; the BERT-Pointer Network model converts text information into word vectors based on the context information and extracts keywords in the text data;
the improved Attention mechanism formula is as follows:
$$u^i_j = v^T \tanh(W_1 e_j + W_2 d_i), \quad j \in (1, \dots, n)$$
$$p(C_i \mid C_1, \dots, C_{i-1}) = \operatorname{softmax}(u^i)$$
the output sequence of the Pointer Network model is derived from the input sequence, the range of $i$ is the length of the output sequence, the maximum range is preset, and the vector $u^i_j$ represents the Attention mask of the $j$-th input vector; $T$ represents matrix transposition; $e_j$ and $d_i$ are state quantities; $v$, $W_1$ and $W_2$ are learning parameters; $C_1$, $C_{i-1}$ and $C_i$ are random variables each representing an item in the input sequence, and $p$ is a hyperparameter representing the joint probability distribution; $p(C_i \mid C_1, \dots, C_{i-1})$ is the conditional probability that the $i$-th item occurs given the preceding $i-1$ items.
2. The method of desensitizing medical data according to claim 1, wherein step S3 comprises: the BERT model converts text information into word vectors based on context information; the PCA model carries out principal component decomposition on the output result of the BERT model, combines similar medical fields, and deletes irrelevant medical fields; and clustering the output result of the PCA model by using a clustering algorithm.
3. The method of desensitizing medical data according to claim 2, comprising: judging cosine distances between the new data and four types of data, namely personal identity information, personal medical information, date information and address information, through a clustering algorithm, and if the distance between the new data and one type of data is nearest and is lower than a preset threshold value, distributing the new data into the type of data; and if the distances between the new data and the four types of data are larger than the preset threshold value, marking the new data as other information.
4. A medical data desensitization system, comprising:
the data classification module is used for classifying the acquired medical data into text data and non-text data;
the sensitive word extraction module is used for extracting keywords in the text data, sending the extracted keywords into the field classification module and retaining the original text of the non-keywords;
the field classification module is used for classifying the non-text data and the keywords into personal identity information, personal medical information, date information, address information and other information;
the data desensitization module is used for carrying out desensitization treatment on the information classified by the field classification module: encrypting personal identity information, blurring personal medical information and date information, masking address information to obtain text description, and preserving other information;
the sensitive word extraction module comprises:
the Pointer Network model is used for improving the Attention mechanism of the Transformer-based BERT model to obtain a BERT-Pointer Network model; the improved Attention mechanism formula is as follows:
$$u^i_j = v^T \tanh(W_1 e_j + W_2 d_i), \quad j \in (1, \dots, n)$$
$$p(C_i \mid C_1, \dots, C_{i-1}) = \operatorname{softmax}(u^i)$$
the output sequence of the Pointer Network model is derived from the input sequence, the range of $i$ is the length of the output sequence, the maximum range is preset, and the vector $u^i_j$ represents the Attention mask of the $j$-th input vector; $T$ represents matrix transposition; $e_j$ and $d_i$ are state quantities; $v$, $W_1$ and $W_2$ are learning parameters; $C_1$, $C_{i-1}$ and $C_i$ are random variables each representing an item in the input sequence, and $p$ is a hyperparameter representing the joint probability distribution; $p(C_i \mid C_1, \dots, C_{i-1})$ is the conditional probability that the $i$-th item occurs given the preceding $i-1$ items;
the BERT-Pointer Network model is used for converting text information into word vectors and extracting keywords based on context information.
5. The medical data desensitization system according to claim 4, wherein the field classification module comprises:
the BERT model is used for converting text information into word vectors based on the context information;
the PCA model is used for carrying out principal component decomposition on the output result of the BERT model, merging similar medical fields and deleting irrelevant medical fields;
and the clustering model is used for clustering the output result of the PCA model.
6. The medical data desensitization system according to claim 4, further comprising: and the output module is used for outputting the desensitized information to the original position of the medical data.
CN202310199626.9A 2023-03-04 2023-03-04 Medical data desensitization method and system Active CN115859372B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310199626.9A CN115859372B (en) 2023-03-04 2023-03-04 Medical data desensitization method and system

Publications (2)

Publication Number Publication Date
CN115859372A CN115859372A (en) 2023-03-28
CN115859372B true CN115859372B (en) 2023-04-25

Family

ID=85659891

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310199626.9A Active CN115859372B (en) 2023-03-04 2023-03-04 Medical data desensitization method and system

Country Status (1)

Country Link
CN (1) CN115859372B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117112858B (en) * 2023-10-24 2024-02-02 武汉博特智能科技有限公司 Object screening method based on association rule mining, processor and storage medium

Citations (10)

Publication number Priority date Publication date Assignee Title
CN107145799A (en) * 2017-05-04 2017-09-08 山东浪潮云服务信息科技有限公司 Data desensitization method and device
CN109784015A (en) * 2018-12-27 2019-05-21 腾讯科技(深圳)有限公司 Authentication identifying method and device
CN110135189A (en) * 2019-04-28 2019-08-16 上海市第六人民医院 Patient privacy information desensitization method for medical text
CN110289059A (en) * 2019-06-13 2019-09-27 北京百度网讯科技有限公司 Medical data processing method, device, storage medium and electronic equipment
CN113065330A (en) * 2021-03-22 2021-07-02 四川大学 Method for extracting sensitive information from unstructured data
CN114595689A (en) * 2022-02-28 2022-06-07 深圳依时货拉拉科技有限公司 Data processing method, data processing device, storage medium and computer equipment
CN115186051A (en) * 2022-03-08 2022-10-14 马上消费金融股份有限公司 Sensitive word detection method and device and computer readable storage medium
CN115188440A (en) * 2021-12-31 2022-10-14 阳江市人民医院 Intelligent matching method for similar medical records
CN115618371A (en) * 2022-07-11 2023-01-17 上海期货信息技术有限公司 Desensitization method and device for non-text data and storage medium
CN115687980A (en) * 2022-11-11 2023-02-03 中国农业银行股份有限公司 Desensitization classification method of data table, and classification model training method and device

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US7454359B2 (en) * 1999-06-23 2008-11-18 Visicu, Inc. System and method for displaying a health status of hospitalized patients

Non-Patent Citations (5)

Title
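Pedro J. Rosa et al. Effects of fear-relevant stimuli on attention: Integrating gaze data with subliminal exposure. 2014 IEEE International Symposium on Medical Measurements and Applications (MeMeA). 2014, pp. 1-6. *
Tang Di et al. Development Trends in Data Desensitization Technology. Secrecy Science and Technology. 2020, (No. 4), pp. 4-11. *
Zhang Yong. Research and Implementation of Finance-Oriented Text Analysis and Summary Generation Technology. China Master's Theses Full-text Database. 2022, Information Science and Technology series, I138-3540. *
Xie Yilin et al. An Electronic Medical Record Storage Method Based on a Graph Database. Information Technology and Informatization. 2021, (No. 8), pp. 134-137. *
Zheng Xuru. Research on Data Desensitization Based on Deep Learning. China Master's Theses Full-text Database. 2021, Information Science and Technology series, I138-2548. *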

Also Published As

Publication number Publication date
CN115859372A (en) 2023-03-28

Similar Documents

Publication Publication Date Title
CN109669994B (en) Construction method and system of health knowledge map
Li et al. Anonymizing and sharing medical text records
Bennetot et al. Towards explainable neural-symbolic visual reasoning
Lee et al. Generating sequential electronic health records using dual adversarial autoencoder
Psychoula et al. A deep learning approach for privacy preservation in assisted living
Blanco-Justicia et al. Machine learning explainability through comprehensible decision trees
Gupta et al. PCA-RF: an efficient Parkinson's disease prediction model based on random forest classification
CN115859372B (en) Medical data desensitization method and system
CN111709233A (en) Intelligent diagnosis guiding method and system based on multi-attention convolutional neural network
Sousa et al. How to keep text private? A systematic review of deep learning methods for privacy-preserving natural language processing
McAteer et al. Integration of biometrics and steganography: a comprehensive review
CN109522740B (en) Health data privacy removal processing method and system
Zhong et al. A Group-Based Personalized Model for Image Privacy Classification and Labeling.
Chaturvedi et al. It’s all in the name: A character-based approach to infer religion
CN111680131A (en) Document clustering method and system based on semantics and computer equipment
CN111986759A (en) Method and system for analyzing electronic medical record, computer equipment and readable storage medium
CN112966517A (en) Training method, device, equipment and medium for named entity recognition model
Saeedi et al. Consumer artificial intelligence mishaps and mitigation strategies
CN113239668B (en) Keyword intelligent extraction method and device, computer equipment and storage medium
Papadopoulou et al. Bootstrapping text anonymization models with distant supervision
CN108122613B (en) Health prediction method and device based on health prediction model
CN113064972A (en) Intelligent question and answer method, device, equipment and storage medium
CN111104481B (en) Method, device and equipment for identifying matching field
Kanwal et al. Fuzz-classification (p, l)-Angel: An enhanced hybrid artificial intelligence based fuzzy logic for multiple sensitive attributes against privacy breaches
CN116502261A (en) Data desensitization method and device for retaining data characteristics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Li Rui, Hu Qitong, Liu Ruihua, Zheng Mingyang

Inventor before: Li Rui, Hu Qitong, Liu Ruihua, Zheng Mingyang, Tang Xuewen