CN116502258A - Sensitive information desensitization and recognition system - Google Patents
- Publication number
- CN116502258A (application number CN202310257477.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- patient
- sensitive information
- information
- mapping
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
- G06F21/6254—Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H15/00—ICT specially adapted for medical reports, e.g. generation or transmission thereof
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Medical Informatics (AREA)
- Bioethics (AREA)
- General Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Medical Treatment And Welfare Office Work (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a sensitive information desensitization and identification system and relates to the technical field of data processing. The system comprises a patient sensitive information identification module, a patient sensitive information grading module, a privacy information definition and labeling module, a text data desensitization module, an image data desensitization module and a data verification module. The patient sensitive information identification module establishes a discovery rule, links the input data with a database, and matches the data in the database by means of the discovery rule.
Description
Technical Field
The invention relates to the technical field of data processing, and in particular to a sensitive information desensitization and identification system.
Background
Sensitive data has the nature of property for the individual who owns it. By law, the data owner holds the rights of ownership, possession, control, use and disposal over the sensitive data, and information sharing is the process by which the data owner exercises the rights of control and use. As relational data is distributed and used under authorization, the leakage of sensitive data is becoming increasingly serious.
In traditional methods, the desensitization of user sensitive data during data interaction is not ideal and takes a long time. To improve research efficiency in clinical research, it is currently common to photograph medical records and laboratory sheets with a mobile phone, desensitize the photographs, upload them, and then process them by OCR. However, once the key information has been desensitized, the authenticity of a laboratory sheet can no longer be judged, while if the photographs are not desensitized there is a risk of leaking private information. This is currently a key bottleneck restricting the development of clinical research based on digital technology.
Disclosure of Invention
In order to overcome the above drawbacks of the prior art, an embodiment of the present invention provides a sensitive information desensitization and identification system, which solves the problems set forth in the background above by means of a text data desensitization module.
In order to achieve the above purpose, the present invention provides the following technical solution. The system comprises a patient sensitive information identification module, a patient sensitive information grading module, a privacy information definition and labeling module, a text data desensitization module, an image data desensitization module and a data verification module. The patient sensitive information identification module establishes a discovery rule, links the input data with a database, and matches the data in the database by means of the discovery rule. The patient sensitive information grading module divides the data into two levels according to the sensitivity of the patient information. The privacy information definition and labeling module labels the basic information of the patient with brief codes. The text data desensitization module constructs a replacement dictionary and adds the candidate values available for replacement to it; during desensitization a replacement value is selected at random, and reordering is performed according to a fixed rule. The image data desensitization module desensitizes the image data by means of an algorithm. The data verification module performs joint verification using the data brief codes and the image desensitization, and verifies the consistency of different laboratory sheets with the patient information.
In a preferred embodiment, in the patient data entry stage the patient sensitive information identification module links with the entered data source to obtain local data and metadata, performs a primary identification of user sensitive information on the obtained data, identifies the patient's sensitive information according to the type and content of the data, and uses a sensitive information identification engine to carry out the identification.
In a preferred embodiment, the patient sensitive information grading module is key to protecting patient sensitive information during data interaction. It divides the identified patient sensitive information into two levels according to its availability, integrity and confidentiality. The grading takes into account the degree of potential threat and the economic loss that leakage of the information would cause the patient, and on the basis of these key factors the sensitivity grading of user sensitive information in the data interaction process is completed.
In a preferred embodiment, the privacy information definition and labeling module treats the patient's name, hospitalization number and hospital bed number as private information. When a sheet is photographed for upload, the sensitive information must be covered before uploading. The patient's name is covered with a black block, and a brief code formed from the initials of the name is placed over the covered area in a white font. The hospitalization number is retained and placed over the covered area, and the hospital bed number is covered completely.
In a preferred embodiment, the text data desensitization module constructs a word replacement dictionary, replaces sensitive information with the corresponding words, transforms the sentence after replacement, and rewrites the corresponding sentence. All candidate values that can serve as replacement values are listed, and during replacement one candidate is selected from the list at random or according to a given rule. In parameterized substitution, the data to be desensitized are used as the input of a mapping function and transformed to obtain the desensitized data. Attribute values of numeric, date and time types are transformed and offset according to rules. The patient's identity card number is rearranged position by position according to a fixed rule, so that the data can be restored by applying the inverse rule. An intermediate mapping table for the desensitization algorithm is established, whose fields consist of the outpatient number, the mapped outpatient number, the mapped inpatient number and the mapped name. The mapped outpatient number is obtained through a mapping relation and consists of a fixed letter and a random number, where the random number must not repeat. The mapped inpatient number is likewise obtained through a mapping relation, with the same mapping rule as the mapped outpatient number. The mapped name is a common name and must not repeat within the mapping table.
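The intermediate mapping table described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `MappingTable`, the prefix letter `M`, the six-digit random part and the sample names are all assumptions, and Python's `random` module is used only to keep the example reproducible (a deployment would more likely draw from a cryptographically secure source such as `secrets`).

```python
import random


class MappingTable:
    """Hypothetical intermediate mapping table for desensitized identifiers."""

    def __init__(self, names, prefix="M", digits=6, seed=None):
        self._rng = random.Random(seed)
        self._prefix = prefix          # the "fixed letter" (assumed value)
        self._digits = digits          # length of the random part (assumed)
        self._free_names = list(names)  # pool of common names, each used once
        self._rng.shuffle(self._free_names)
        self._used_numbers = set()
        self._rows = {}

    def _new_number(self):
        """Draw fixed letter + random number, retrying until it is unused."""
        while True:
            number = self._rng.randrange(10 ** self._digits)
            candidate = f"{self._prefix}{number:0{self._digits}d}"
            if candidate not in self._used_numbers:
                self._used_numbers.add(candidate)
                return candidate

    def map_patient(self, outpatient_no):
        """Return the row for an outpatient number, creating it on first use."""
        if outpatient_no not in self._rows:
            if not self._free_names:
                raise RuntimeError("no unused mapped names left")
            self._rows[outpatient_no] = {
                "mapped_outpatient_no": self._new_number(),
                "mapped_inpatient_no": self._new_number(),  # same rule as above
                "mapped_name": self._free_names.pop(),
            }
        return self._rows[outpatient_no]


table = MappingTable(names=["Li Wei", "Wang Fang", "Zhang Min"], seed=1)
row_a = table.map_patient("OP001")
row_b = table.map_patient("OP002")
```

Looking up the same outpatient number twice returns the same row, so repeated desensitization of one patient's records stays consistent.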
In a preferred embodiment, the picture data desensitization module determines the position of the main image, reads information from the images in the database, prints information label names on the images, and screens the labels by label name. It then operates on the sensitive information obtained from the screening, distributes the sensitive information of the patient image around the image, and embeds and reads it steganographically by means of the OutGuess algorithm.
In a preferred embodiment, the data verification module performs a combined verification using the data brief code, the picture steganography and the patient's abbreviation in the EDC (electronic data capture) system to verify the consistency of the laboratory sheet, and verifies the consistency of different laboratory sheets and medical records with the patient information by means of the trailing digits of the patient's hospitalization number.
In a preferred embodiment, the sensitive information desensitization and identification system operates according to the following steps:
S1, linking the input data with the database and matching against the data characteristics configured in the discovery rule, wherein the higher the matching degree, the higher the identification rate of sensitive objects;
S2, dividing the data into two levels according to the availability, integrity and confidentiality of the patient sensitive information;
S3, labeling the patient's name, hospitalization number and hospital bed number with brief codes;
S4, constructing a word replacement dictionary, listing the candidate values that can serve as replacement values, selecting a replacement value at random, and reordering according to a fixed rule;
S5, desensitizing the image data by means of the OutGuess algorithm;
S6, performing joint verification using the data brief codes and the image desensitization to determine the consistency between the laboratory sheet and the patient.
The invention has the technical effects and advantages that:
By identifying and grading the patient's sensitive information, the invention desensitizes patient information during patient data interaction and constrains the output format of the desensitized user information. By combining desensitization, brief-code labeling and picture steganography, it enables multiple cross-checks of the private information after desensitization, which addresses the errors that individual misrecognized items in OCR may cause and removes the risk of privacy disclosure.
Drawings
FIG. 1 is a flow chart of the system of the present invention.
FIG. 2 is a block diagram of the system architecture of the present invention.
Detailed Description
The embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.
Example 1
This embodiment provides a sensitive information desensitization and identification system as shown in FIG. 1. The system comprises a patient sensitive information identification module, a patient sensitive information grading module, a privacy information definition and labeling module, a text data desensitization module, an image data desensitization module and a data verification module. The patient sensitive information identification module establishes a discovery rule, links the input data with a database, and matches the data in the database by means of the discovery rule. The patient sensitive information grading module divides the data into two levels according to the sensitivity of the patient information. The privacy information definition and labeling module labels the basic information of the patient with brief codes. The text data desensitization module constructs a replacement dictionary and adds the candidate values available for replacement to it; during desensitization a replacement value is selected at random, and reordering is performed according to a fixed rule. The image data desensitization module desensitizes the image data by means of an algorithm. The data verification module performs joint verification using the data brief codes and the image desensitization, and verifies the consistency of different laboratory sheets with the patient information.
As shown in FIG. 2, the sensitive information desensitization and identification system of this embodiment operates according to the following steps:
101. Linking the input data with the database and matching against the data characteristics configured in the discovery rule; the higher the matching degree, the higher the identification rate of sensitive objects;
In this embodiment, the patient sensitive information identification module is described in detail. In the patient data entry stage, the module links with the entered data source and checks the connectivity of the data source, obtains local data and metadata, and performs a primary identification of user sensitive information on the obtained data. It identifies the patient's sensitive information according to the category and content of the data, using a sensitive information identification engine. The identification methods include feature word extraction from the database, rules, and natural language processing. The patient's data fields are stored in an identification library of sensitive fields, and during a sensitive information identification task the longer data fields are identified with priority, which improves the efficiency of identifying and desensitizing the patient's sensitive information. All statements in the database are analyzed on the basis of the data recorded for the patient, and sensitive information is identified in the sensitive information database. According to the data characteristics configured in the discovery rule, and combining field types with sample data, the field data in the database are compared and analyzed to obtain the degree of matching with the discovery rule. When the matching degree reaches a set threshold, the discovery rule is considered matched; the more sample data there are, the higher the identification rate of sensitive objects. By sampling and analyzing the data, metadata with privacy characteristics can be sorted out quickly and sensitive data can be discovered automatically. The detected data include names, certificate numbers, bank accounts, addresses and telephone numbers. The user is alerted to the sensitive data and can go directly to the configuration of desensitization rules through a guided mode.
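The threshold-based matching of sampled field data against discovery rules can be sketched as follows. This is a minimal illustration under stated assumptions: a discovery rule is modeled as a regular expression plus a threshold, the matching degree as the fraction of sampled values that match, and the names `DiscoveryRule`, `match_degree` and `discover_sensitive_fields` as well as the two patterns are hypothetical rather than taken from the patent.

```python
import re
from dataclasses import dataclass


@dataclass
class DiscoveryRule:
    """Hypothetical discovery rule: a name, a value pattern and a threshold."""
    name: str
    pattern: str            # regular expression a sensitive value must fully match
    threshold: float = 0.8  # minimum fraction of sampled values that must match


def match_degree(rule, samples):
    """Return the fraction of non-empty sampled values that match the rule."""
    values = [s for s in samples if s]
    if not values:
        return 0.0
    hits = sum(1 for v in values if re.fullmatch(rule.pattern, v))
    return hits / len(values)


def discover_sensitive_fields(rules, table_samples):
    """Map each field name to the rules whose matching degree reaches the threshold."""
    found = {}
    for field_name, samples in table_samples.items():
        matched = [r.name for r in rules if match_degree(r, samples) >= r.threshold]
        if matched:
            found[field_name] = matched
    return found


# Illustrative rules only; real patterns would depend on the deployment.
RULES = [
    DiscoveryRule("phone_number", r"1\d{10}"),
    DiscoveryRule("id_card_number", r"\d{17}[\dXx]"),
]

# Sampled column values, keyed by field name (made-up data).
SAMPLES = {
    "contact": ["13800000000", "13911112222", "n/a", "13700001111", "13600002222"],
    "diagnosis": ["hypertension", "type 2 diabetes"],
}

result = discover_sensitive_fields(RULES, SAMPLES)
```

Here four of the five sampled `contact` values match the phone pattern, which reaches the 0.8 threshold, so only that field is flagged.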
102. Dividing the data information into two different levels according to the availability, integrity and confidentiality of the patient sensitive information;
In this embodiment, the patient sensitive information grading module is described in detail. The module is key to protecting patient sensitive information during data interaction. It divides the identified patient sensitive information into two levels according to its availability, integrity and confidentiality. The grading takes into account the degree of potential threat and the economic loss that leakage of the information would cause the patient, and on the basis of these key factors the sensitivity grading of user sensitive information in the data interaction process is completed.
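A two-level grading of this kind can be sketched as follows. The text does not name the two levels or say how the factors are combined, so the level names, the 0 to 3 scoring scale and the "worst score reaches a cutoff" rule below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class SensitivityLevel(Enum):
    """Hypothetical names for the two sensitivity levels."""
    HIGH = "high"
    LOW = "low"


@dataclass(frozen=True)
class ImpactAssessment:
    """Assumed 0-3 impact scores for one piece of patient information."""
    confidentiality: int
    integrity: int
    availability: int
    economic_loss: int


def grade(assessment, cutoff=2):
    """Assign HIGH when any single impact score reaches the cutoff, else LOW."""
    worst = max(
        assessment.confidentiality,
        assessment.integrity,
        assessment.availability,
        assessment.economic_loss,
    )
    return SensitivityLevel.HIGH if worst >= cutoff else SensitivityLevel.LOW


id_card_level = grade(ImpactAssessment(3, 1, 1, 2))
bed_no_level = grade(ImpactAssessment(1, 1, 0, 0))
```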
103. The name, the hospitalization number and the sickbed number of the patient are marked by brief codes;
In this embodiment, the privacy information definition and labeling module is described in detail. The patient's name, hospitalization number and hospital bed number are all private information. When a sheet is photographed for upload, the sensitive information must be covered before uploading. The patient's name is covered with a black block, and a brief code formed from the initials of the name is placed over the covered area in a white font. The hospitalization number is retained and placed over the covered area, and the hospital bed number is covered completely.
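The text that replaces each covered region can be sketched as follows. Claim 4 states that four positions of the hospitalization number are retained; taking the last four is an assumption here, as are the romanized name input and the function names `name_brief_code` and `label_record`. Drawing the black block and white text onto the photograph is outside the scope of this sketch.

```python
def name_brief_code(romanized_name):
    """Build an initials brief code from a romanized name, one letter per word."""
    return "".join(part[0].upper() for part in romanized_name.split())


def label_record(name, hospitalization_no, bed_no, keep=4):
    """Return the text to overlay on each covered region of the photograph."""
    return {
        "name": name_brief_code(name),
        # Assumption: the retained positions are the trailing `keep` characters.
        "hospitalization_no": hospitalization_no[-keep:],
        # The bed number is fully covered, so nothing is written back.
        "bed_no": "",
    }


labels = label_record("Zhang San", "2023004512", "B-17")
```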
104. Constructing a word replacement dictionary, listing candidate values which can be used as replacement values, randomly selecting the replacement values, and reordering according to a fixed rule;
In this embodiment, the text data desensitization module is described in detail. The module constructs a word replacement dictionary, replaces sensitive information with the corresponding words, transforms the sentence after replacement, and rewrites the corresponding sentence. All candidate values that can serve as replacement values are listed, and during replacement one candidate is selected from the list at random or according to a given rule. In parameterized substitution, the data to be desensitized are used as the input of a mapping function and transformed to obtain the desensitized data. Attribute values of numeric, date and time types are transformed and offset according to rules. The patient's identity card number is rearranged position by position according to a fixed rule, so that the data can be recovered by applying the inverse rule. An intermediate mapping table for the desensitization algorithm is created, whose fields consist of the outpatient number, the mapped outpatient number and the mapped name. The mapped outpatient number is obtained through a mapping relation and consists of a fixed letter and a random number, where the random number must not repeat. The mapped inpatient number is likewise obtained through a mapping relation, with the same mapping rule as the mapped outpatient number. The mapped name is a common name and must not repeat within the mapping table.
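The reversible rearrangement and the rule-based offset can be sketched as follows. The concrete permutation, the 18-character identity number length and the helper names are assumptions for illustration; the text only requires that the rule be fixed and invertible. The sample number is a made-up value, not real patient data.

```python
from datetime import date, timedelta

ID_LENGTH = 18

# Hypothetical fixed rule: output position i takes the character at
# PERMUTATION[i]. Multiplying by 5 modulo 18 visits every position exactly
# once because 5 and 18 share no common factor.
PERMUTATION = [(i * 5) % ID_LENGTH for i in range(ID_LENGTH)]


def rearrange(value, perm=PERMUTATION):
    """Reorder the characters of an identity number by the fixed rule."""
    if len(value) != len(perm):
        raise ValueError("value length does not match the rule")
    return "".join(value[p] for p in perm)


def restore(value, perm=PERMUTATION):
    """Apply the inverse rule to recover the original identity number."""
    if len(value) != len(perm):
        raise ValueError("value length does not match the rule")
    original = [""] * len(perm)
    for i, p in enumerate(perm):
        original[p] = value[i]
    return "".join(original)


def offset_date(d, days):
    """Shift a date attribute by a fixed number of days."""
    return d + timedelta(days=days)


sample_id = "11010519491231002X"
masked_id = rearrange(sample_id)
shifted = offset_date(date(2023, 3, 16), -30)
```

A plain permutation keeps every character of the original value, so it hides the layout of the number rather than its content; access to the rule itself would need to be controlled.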
105. Desensitizing the image data by means of the OutGuess algorithm;
In this embodiment, the picture data desensitization module is described in detail. The module determines the position of the main image, reads information from the images in the database, prints information label names on the images, and screens the labels by label name. It then operates on the sensitive information obtained from the screening, distributes the sensitive information of the patient image around the image, and embeds and reads it steganographically by means of the OutGuess algorithm.
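The embed-and-read cycle can be illustrated with a much simpler technique than the algorithm named in the text. The sketch below swaps in plain least-significant-bit (LSB) substitution on a flat list of 8-bit pixel values; it does not reproduce the named algorithm, it writes the payload into the first pixels rather than distributing it around the image, and the two-byte length header and function names are assumptions.

```python
def embed(pixels, payload):
    """Hide payload bytes in the least significant bits of 8-bit pixel values."""
    data = len(payload).to_bytes(2, "big") + payload  # 2-byte length header
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for payload")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # overwrite only the lowest bit
    return stego


def extract(pixels):
    """Read back the payload written by embed()."""

    def read_bytes(start_bit, count):
        out = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (pixels[start_bit + b * 8 + k] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 2), "big")
    return read_bytes(16, length)


cover = [128] * 400                 # stand-in for grayscale pixel data
stego = embed(cover, b"ZS|4512")    # hypothetical payload: brief code and number tail
recovered = extract(stego)
```

Each pixel changes by at most one intensity level, which is why the hidden payload is not visible in the uploaded picture. LSB data does not survive lossy recompression, so this sketch only applies to losslessly stored images.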
106. Performing joint verification using the data brief codes and the image desensitization to determine the consistency between the laboratory sheet and the patient;
In this embodiment, the data verification module is described in detail. The module performs a three-way combined verification using the data brief code, the picture steganography and the patient's abbreviation in the EDC to verify the consistency of the laboratory sheet, and verifies the consistency of different laboratory sheets and medical records with the patient information by means of the trailing digits of the patient's inpatient number.
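The three-way check and the cross-sheet check can be sketched as follows. The record layout `SheetEvidence` and both function names are hypothetical; how the visible brief code is read from the picture (for example by OCR) and how the hidden payload is decoded are assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SheetEvidence:
    """Hypothetical values recovered from one uploaded laboratory sheet."""
    visible_brief_code: str   # initials read from the covered name area
    hidden_brief_code: str    # initials recovered from the steganographic payload
    inpatient_no_tail: str    # retained trailing digits of the inpatient number


def verify_sheet(sheet, edc_abbreviation):
    """Three-way check: visible code, hidden payload and EDC abbreviation agree."""
    return sheet.visible_brief_code == sheet.hidden_brief_code == edc_abbreviation


def same_patient(sheets):
    """Check that all sheets carry the same inpatient-number tail."""
    return len({s.inpatient_no_tail for s in sheets}) <= 1


sheet_1 = SheetEvidence("ZS", "ZS", "4512")
sheet_2 = SheetEvidence("ZS", "ZS", "4512")
sheet_3 = SheetEvidence("ZS", "LW", "9001")   # e.g. an OCR error or a mixed-up sheet

ok = verify_sheet(sheet_1, "ZS")
mismatch = verify_sheet(sheet_3, "ZS")
consistent = same_patient([sheet_1, sheet_2])
```

Because the hidden payload does not depend on OCR, a disagreement between the visible and hidden codes points to a recognition error or a wrongly attributed sheet.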
Finally, the foregoing describes only preferred embodiments of the invention and is not intended to limit it. Any modifications, equivalents and alternatives made within the spirit and principles of the invention are intended to be included within its scope of protection.
Claims (8)
1. A sensitive information desensitization and identification system, characterized in that: the system comprises a patient sensitive information identification module, a patient sensitive information grading module, a privacy information definition and labeling module, a text data desensitization module, an image data desensitization module and a data verification module; the patient sensitive information identification module establishes a discovery rule, links the input data with a database, and matches the data in the database by means of the discovery rule; the patient sensitive information grading module divides the data into two levels according to the sensitivity of the patient information; the privacy information definition and labeling module labels the basic information of the patient with brief codes; the text data desensitization module constructs a replacement dictionary and adds the candidate values available for replacement to it, wherein during desensitization a replacement value is selected at random and reordering is performed according to a fixed rule; the image data desensitization module desensitizes the image data by means of an algorithm; and the data verification module performs joint verification using the data brief codes and the image desensitization, and verifies the consistency of different laboratory sheets with the patient information.
2. The sensitive information desensitization and identification system according to claim 1, characterized in that: in the patient data entry stage, the patient sensitive information identification module links with the entered data source and checks the connectivity of the data source, obtains local data and metadata, and performs a primary identification of user sensitive information on the obtained data; it identifies the patient's sensitive information according to the type and content of the data, using a sensitive information identification engine; the identification methods include feature word extraction from the database, rules, and natural language processing; the patient's data fields are stored in an identification library of sensitive fields, and during a sensitive information identification task the longer data fields are identified with priority, so as to improve the efficiency of identifying and desensitizing the patient's sensitive information; all statements in the database are analyzed on the basis of the data recorded for the patient, and sensitive information is identified in the sensitive information database; according to the data characteristics configured in the discovery rule, and combining field types with sample data, the field data in the database are compared and analyzed to obtain the degree of matching with the discovery rule; when the matching degree reaches a set threshold, the discovery rule is considered matched, and the more sample data there are, the higher the identification rate of sensitive objects; by sampling and analyzing the data, metadata with privacy characteristics are sorted out quickly and sensitive data are discovered automatically; the detected data include names, certificate numbers, bank accounts, addresses and telephone numbers; and the user is alerted to the sensitive data and can go directly to the configuration of desensitization rules through a guided mode.
3. The sensitive information desensitization and identification system according to claim 1, characterized in that: the patient sensitive information grading module is key to protecting patient sensitive information during data interaction and divides the identified patient sensitive information into two levels according to its availability, integrity and confidentiality, wherein the grading takes into account the degree of potential threat and the economic loss that leakage of the information would cause the patient, and the sensitivity grading of user sensitive information in the data interaction process is completed on the basis of these key grading factors.
4. The sensitive information desensitization and identification system according to claim 1, characterized in that: in the privacy information definition and labeling module, the patient's name, hospitalization number and hospital bed number are all private information; when a sheet is photographed for upload, the sensitive information must be covered before uploading; the patient's name is covered with a black block, and a brief code formed from the initials of the name is placed over the covered area in a white font; four digits of the hospitalization number are retained and placed over the covered area; and the hospital bed number is covered completely.
5. The sensitive information desensitization and identification system according to claim 1, characterized in that: the text data desensitization module constructs a word replacement dictionary, replaces sensitive information with the corresponding words, transforms the sentence after replacement, and rewrites the corresponding sentence; during replacement one candidate is selected, at random or according to a given rule, from the candidate values that can serve as replacement values; in parameterized substitution, the data to be desensitized are used as the input of a mapping function; attribute values of numeric, date and time types are transformed and offset according to rules; the patient's identity card number is rearranged position by position according to a fixed rule, so that the data can be restored by applying the inverse rule; an intermediate mapping table for the desensitization algorithm is established, whose fields consist of the outpatient number, the mapped outpatient number and the mapped name; the mapped outpatient number is obtained through a mapping relation and consists of a fixed letter and a random number, where the random number must not repeat; the mapped inpatient number is obtained through a mapping relation, and when the mapped number is blank the mapping rule is the same as that of the mapped outpatient number; and the mapped name is a common name selected at random and is not repeated within the mapping table.
6. The sensitive information desensitization and identification system according to claim 1, characterized in that: the picture data desensitization module determines the position of the main image, reads information from the images in the database, prints information label names on the images, screens the labels by label name, operates on the sensitive information obtained from the screening, distributes the sensitive information of the patient image around the image, and embeds and reads it steganographically by means of the OutGuess algorithm.
7. A sensitive information desensitizing and identifying system according to claim 1, wherein: the data verification module performs a three-way joint verification using the data brief code, the picture steganography and the short name of the patient in the EDC to confirm the consistency of the laboratory sheets, and verifies the consistency of different laboratory sheets, medical history and patient information through the trailing digits of the patient's inpatient number.
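The three-way check in claim 7 can be sketched as a pair of small predicates. The record layout, the field names and the choice of four trailing digits are assumptions; the claim only states which three items are compared and that the trailing digits of the inpatient number link different records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabSheet:
    """Hypothetical record layout; the field names are illustrative only."""
    brief_code: str        # brief code labelled on the data
    stego_code: str        # code read back from the image steganography
    edc_short_name: str    # patient's short name as recorded in the EDC
    inpatient_number: str  # desensitized inpatient number


def verify_sheet(sheet: LabSheet, expected_code: str, expected_short_name: str) -> bool:
    """Three-way check: brief code, steganographic code and EDC short name must all agree."""
    return (
        sheet.brief_code == expected_code
        and sheet.stego_code == expected_code
        and sheet.edc_short_name == expected_short_name
    )


def same_patient(sheets, tail_length: int = 4) -> bool:
    """Check that all sheets share the same trailing digits of the inpatient number."""
    tails = {sheet.inpatient_number[-tail_length:] for sheet in sheets}
    return len(tails) == 1
```

Requiring all three items to agree means that a mismatch in any single channel, for example an image attached to the wrong record, is enough to reject the sheet.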
8. A sensitive information desensitizing and identifying system according to any one of claims 1-7, wherein the specific operation steps are as follows:
S1, linking the input data with a database and identifying sensitive objects according to the data characteristics configured in a discovery rule, wherein the higher the matching degree, the higher the identification rate of a sensitive object;
S2, dividing the data information into two different levels according to the availability, the integrity and the confidentiality of the sensitive information of the patient;
S3, labeling the name, the hospitalization number and the sickbed number of the patient with brief codes;
S4, constructing a word replacement dictionary, listing the candidate values which can be used as replacement values, randomly selecting the replacement values, and reordering according to a fixed rule;
S5, desensitizing the image data through the OutGuess algorithm;
S6, carrying out joint verification through the data brief codes and the image desensitization, and determining the consistency of the laboratory sheet and the patient.
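Step S1 above ties identification to how well the data match the characteristics configured in a discovery rule. A minimal sketch of that idea is to express each rule as a regular expression and score a column by the fraction of values that match. The rule names, the patterns and the 0.8 threshold are assumptions and are not taken from the patent.

```python
import re

# Hypothetical discovery rules: each maps a sensitive-object type to a pattern.
RULES = {
    "id_card_number": re.compile(r"\d{17}[\dXx]"),
    "mobile_number": re.compile(r"1\d{10}"),
}


def match_degree(values, pattern) -> float:
    """Fraction of values in a column that fully match the rule's pattern."""
    if not values:
        return 0.0
    hits = sum(1 for value in values if pattern.fullmatch(value))
    return hits / len(values)


def discover(values, threshold: float = 0.8):
    """Return the best-matching rule name, or None if no rule reaches the threshold."""
    scores = {name: match_degree(values, pattern) for name, pattern in RULES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None
```

With this scoring, a column is only flagged as sensitive when most of its values fit one rule, so a stray value that happens to look like an ID number does not mark the whole column.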
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310257477.7A CN116502258A (en) | 2023-03-16 | 2023-03-16 | Sensitive information desensitization and recognition system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310257477.7A CN116502258A (en) | 2023-03-16 | 2023-03-16 | Sensitive information desensitization and recognition system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116502258A (en) | 2023-07-28 |
Family
ID=87317388
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310257477.7A Pending CN116502258A (en) | 2023-03-16 | 2023-03-16 | Sensitive information desensitization and recognition system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116502258A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117349879A (en) * | 2023-09-11 | 2024-01-05 | 江苏汉康东优信息技术有限公司 | Text data anonymization privacy protection method based on continuous word bag model |
CN118350050A (en) * | 2024-06-12 | 2024-07-16 | 山东浪潮科学研究院有限公司 | Data desensitizing method, device, electronic equipment, storage medium and computer program |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN116502258A (en) | Sensitive information desensitization and recognition system | |
US20210327000A1 (en) | Systems and methods for insurance fraud detection | |
US20240119177A1 (en) | Method and system for automated text anonymisation | |
Beckwith et al. | Development and evaluation of an open source software tool for deidentification of pathology reports | |
CN103294746B (en) | For the method and system for going identification in visual media data | |
US8468167B2 (en) | Automatic data validation and correction | |
CN105653984B (en) | File fingerprint method of calibration and device | |
US9372916B2 (en) | Document template auto discovery | |
US11120221B2 (en) | Method and system to resolve ambiguities in regulations | |
US20140215301A1 (en) | Document template auto discovery | |
CN112257446A (en) | Named entity recognition method and device, computer equipment and readable storage medium | |
US20040162831A1 (en) | Document handling system and method | |
US20230282322A1 (en) | System and method for anonymizing medical records | |
CN112508145B (en) | Electronic seal generation and verification method and device, electronic equipment and storage medium | |
EP4185984A1 (en) | Classifying pharmacovigilance documents using image analysis | |
Mohammadi et al. | Weakly supervised learning and interpretability for endometrial whole slide image diagnosis | |
CN113221762A (en) | Cost balance decision method, insurance claim settlement decision method, device and equipment | |
CN116052848B (en) | Data coding method and system for medical imaging quality control | |
CN116469526A (en) | Training method, device, equipment and storage medium for traditional Chinese medicine diagnosis model | |
JP4679955B2 (en) | Wound and disease name coding method and wound and disease name coding program | |
Mosquera-Zamudio et al. | A Spitzoid Tumor dataset with clinical metadata and Whole Slide Images for Deep Learning models | |
JP2020140583A (en) | Dictionary creation device, dictionary creation method, and dictionary creation program | |
CN113592523B (en) | Financial data processing system and method | |
Etter et al. | Project SEARCH (Scanning EARs for Child Health): validating an ear biometric tool for patient identification in Zambia | |
CN113947510A (en) | Real estate electronic license management system based on file format self-adaptation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||