CN116502258A - Sensitive information desensitization and recognition system - Google Patents

Sensitive information desensitization and recognition system

Info

Publication number
CN116502258A
Authority
CN
China
Prior art keywords
data
patient
sensitive information
information
mapping
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310257477.7A
Other languages
Chinese (zh)
Inventor
张发宝
李欣梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Medsci Medical Technology Co ltd
Original Assignee
Shanghai Medsci Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2023-03-16
Filing date
2023-03-16
Publication date
2023-07-28
Application filed by Shanghai Medsci Medical Technology Co ltd
Priority to CN202310257477.7A
Publication of CN116502258A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Bioethics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a sensitive information desensitization and identification system in the technical field of data processing. The system comprises a patient sensitive information identification module, a patient sensitive information grading module, a privacy information definition and labeling module, a text data desensitization module, an image data desensitization module and a data verification module. The patient sensitive information identification module establishes discovery rules, links the input data to a database, and matches the data in the database against the discovery rules.

Description

Sensitive information desensitization and recognition system
Technical Field
The invention relates to the technical field of data processing, and in particular to a sensitive information desensitization and identification system.
Background
Sensitive data has a property-like attribute for the person who owns it. By law, the data owner holds the rights of ownership, possession, control, use and disposal over the sensitive data, and information sharing is the process in which the data owner exercises the rights of control and use. As relational data is distributed and its use is authorized, the leakage of sensitive data is becoming increasingly serious.
In traditional methods, the desensitization of user sensitive data during data interaction is not very effective and takes a long time. To improve efficiency in clinical research, medical records and laboratory sheets are currently photographed with a mobile phone, desensitized, uploaded, and then processed by OCR recognition. However, once the key information has been desensitized, the authenticity of a laboratory sheet can no longer be judged, whereas if the data is not desensitized there is a risk of leaking private information. This is currently a key bottleneck restricting the development of digital technology in clinical research.
Disclosure of Invention
To overcome the above drawbacks of the prior art, an embodiment of the present invention provides a sensitive information desensitization and identification system, which addresses the problems set forth in the background by means of a text data desensitization module.
To achieve the above purpose, the present invention provides the following technical solution. The system comprises a patient sensitive information identification module, a patient sensitive information grading module, a privacy information definition and labeling module, a text data desensitization module, an image data desensitization module and a data verification module. The patient sensitive information identification module establishes discovery rules, links the input data to a database, and matches the data in the database against the discovery rules. The patient sensitive information grading module divides the data into two levels according to the sensitivity of the patient information. The privacy information definition and labeling module labels the patient's basic information with brief codes. The text data desensitization module constructs a replacement dictionary to which candidate replacement values are added; during data desensitization a replacement value is selected at random, and values are reordered according to a fixed rule. The image data desensitization module desensitizes the image data by means of an algorithm. The data verification module performs joint verification using the data brief codes and the image desensitization, verifying that different laboratory sheets are consistent with the patient information.
In a preferred embodiment, during the patient data entry stage the patient sensitive information identification module links to the entered data source to obtain local data and metadata, performs a preliminary identification of user sensitive information on the obtained data, and then identifies the patient's sensitive information according to the type and content of the data, using a sensitive information identification engine.
In a preferred embodiment, the patient sensitive information grading module serves to focus protection on the patient's sensitive information during data interaction. It divides the identified patient sensitive information into two levels according to its availability, integrity and confidentiality. The grading takes into account the degree of potential threat and economic loss that leakage of the information would cause the patient, and on the basis of these key grading factors the module completes the sensitivity grading of user sensitive information in the data interaction process.
In a preferred embodiment, the privacy information definition and labeling module treats the patient's name, hospitalization number and bed number as private information, which must be covered before a photograph is uploaded. The patient's name is covered with a black block, and the initials of the name are placed over the covered area in a white font as a brief code. The hospitalization number is retained and placed over the covered area, and the bed number is covered completely.
In a preferred embodiment, the text data desensitization module constructs a word replacement dictionary, replaces sensitive information with corresponding words, and converts and rewrites the resulting sentences. All candidate values that can serve as replacement values are listed, and at replacement time one candidate is selected from the list either at random or according to a certain rule. For parameterized substitution, the data to be desensitized is taken as the input of a mapping function and transformed to obtain the desensitized data. Attribute values of numeric, date and time types are transformed and offset according to rules. The patient's identity card number is rearranged position by position according to a fixed rule, so that the data can be restored with the inverse rule. An intermediate mapping table for the desensitization algorithm is established, whose fields consist of the outpatient number, the mapped outpatient number and the mapped name. The mapped outpatient number is obtained through a mapping relation and consists of a fixed letter and a random number, where the random number must not repeat. The mapped inpatient number is obtained through a mapping relation that follows the same rule as the mapped outpatient number. The mapped name is a common name and is not repeated in the mapping table.
In a preferred embodiment, the picture data desensitization module determines the position of the main image, reads information from the images in the database, marks information label names on the images, and screens the labels by label name. It then operates on the sensitive information obtained from the screening, distributes the sensitive information of the patient image around the image, and performs steganographic writing and reading by means of the Outgusee algorithm.
In a preferred embodiment, the data verification module performs joint verification using the data brief code, the picture steganography and the patient's abbreviated name in the EDC to verify the consistency of a laboratory sheet, and uses the trailing digits of the patient's hospitalization number to verify that different laboratory sheets and medical records are consistent with the patient information.
In a preferred embodiment, the sensitive information desensitizing and identifying system specifically comprises the following operation steps:
S1. Linking the input data with a database and matching it against the data characteristics configured in the discovery rules, where the higher the matching degree, the higher the identification rate of sensitive objects;
S2. Dividing the data into two levels according to the availability, integrity and confidentiality of the patient's sensitive information;
S3. Labeling the patient's name, hospitalization number and bed number with brief codes;
S4. Constructing a word replacement dictionary, listing the candidate values that can serve as replacement values, selecting a replacement value at random, and reordering according to a fixed rule;
S5. Desensitizing the image data by means of the Outgusee algorithm;
S6. Carrying out joint verification using the data brief codes and the image desensitization to determine the consistency of the laboratory sheet and the patient.
The invention has the technical effects and advantages that:
By identifying and grading the patient's sensitive information, desensitizing it during patient data interaction and constraining its format, the invention desensitizes patient information and limits the output format of the user information after desensitization. By combining desensitization, brief code labeling and picture steganography, it allows desensitized private information to be cross-checked in several ways, which addresses the errors that individual OCR misreadings might otherwise cause and addresses the risk of privacy disclosure.
Drawings
FIG. 1 is a flow chart of the system of the present invention.
Fig. 2 is a block diagram of the system architecture of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of the invention.
Example 1
This embodiment provides a sensitive information desensitization and identification system as shown in fig. 1. The system comprises a patient sensitive information identification module, a patient sensitive information grading module, a privacy information definition and labeling module, a text data desensitization module, an image data desensitization module and a data verification module. The patient sensitive information identification module establishes discovery rules, links the input data to a database, and matches the data in the database against the discovery rules. The patient sensitive information grading module divides the data into two levels according to the sensitivity of the patient information. The privacy information definition and labeling module labels the patient's basic information with brief codes. The text data desensitization module constructs a replacement dictionary to which candidate replacement values are added; during data desensitization a replacement value is selected at random, and values are reordered according to a fixed rule. The image data desensitization module desensitizes the image data by means of an algorithm. The data verification module performs joint verification using the data brief codes and the image desensitization, verifying that different laboratory sheets are consistent with the patient information.
As shown in fig. 2, the embodiment provides a sensitive information desensitizing and identifying system, which specifically includes the following steps:
101. Linking the input data with the database and matching it against the data characteristics configured in the discovery rules, where the higher the matching degree, the higher the identification rate of sensitive objects;
This embodiment specifically describes the patient sensitive information identification module. During the patient data entry stage, the module links to the entered data source, checks the connectivity of the data source, and obtains local data and metadata. It performs a preliminary identification of user sensitive information on the obtained data and then identifies the patient's sensitive information according to the category and content of the data, using a sensitive information identification engine. The identification methods include feature word extraction in the database, rules, and natural language processing. The patient's data fields are stored in an identification library of sensitive fields, and during a sensitive information identification task the long data fields are the focus of identification, which improves the efficiency of identifying and desensitizing the patient's sensitive information. Based on the data entered for the patient, all statements in the database are analyzed and the sensitive information is identified in the sensitive information database. According to the data characteristics configured in a discovery rule, and combining field types with sample data, the field data in the database is compared and analyzed to obtain its matching degree with the discovery rule. When the matching degree reaches a set threshold, the discovery rule is considered matched; the more sample data there is, the higher the identification rate of sensitive objects. Sampling analysis of the data makes it possible to quickly sort out metadata with privacy characteristics and to discover sensitive data automatically. The detected data includes names, certificate numbers, bank accounts, addresses and telephone numbers. The user is alerted to the sensitive data and can proceed directly to configuring the desensitization rules through a guided mode.
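As an illustration only, the threshold-based matching of step 101 could be sketched in Python as follows. The description does not specify how a discovery rule is encoded, so the `DiscoveryRule` class, the regular expressions and the threshold values are all assumptions made for this sketch.

```python
import re
from dataclasses import dataclass


@dataclass
class DiscoveryRule:
    name: str         # label of the sensitive data type (hypothetical)
    pattern: str      # regular expression standing in for the "data characteristic"
    threshold: float  # minimum matching degree for the rule to count as matched


def matching_degree(rule, samples):
    """Fraction of non-empty sample values that fully match the rule's pattern."""
    values = [s for s in samples if s]
    if not values:
        return 0.0
    hits = sum(1 for v in values if re.fullmatch(rule.pattern, v))
    return hits / len(values)


def discover(columns, rules):
    """Map each column to the first rule whose matching degree reaches its threshold."""
    found = {}
    for column, samples in columns.items():
        for rule in rules:
            if matching_degree(rule, samples) >= rule.threshold:
                found[column] = rule.name
                break
    return found


# Hypothetical rules: an 11-digit mobile number and an 18-character ID number.
RULES = [
    DiscoveryRule("phone", r"1\d{10}", 0.7),
    DiscoveryRule("id_number", r"\d{17}[\dXx]", 0.7),
]

# Sampled field data; all values are fictional.
columns = {
    "contact": ["13800000000", "13911112222", "13700001111", "n/a", ""],
    "visit_note": ["follow-up in two weeks", "stable"],
}

sensitive_columns = discover(columns, RULES)
```

Here `contact` is flagged because three of its four non-empty samples match the phone pattern, while `visit_note` matches no rule and is left alone.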
102. Dividing the data information into two different levels according to the availability, integrity and confidentiality of the patient sensitive information;
This embodiment specifically describes the patient sensitive information grading module. The module serves to focus protection on the patient's sensitive information during data interaction. It divides the identified patient sensitive information into two levels according to its availability, integrity and confidentiality. The grading takes into account the degree of potential threat and economic loss that leakage of the information would cause the patient, and on the basis of these key grading factors the module completes the sensitivity grading of user sensitive information in the data interaction process.
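A minimal sketch of such a two-level grading is given below. The description names the grading factors but not how they are combined, so the 0 to 3 impact scale, the cutoff and the example scores are assumptions, and the threat and economic-loss considerations are taken to be folded into those scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Impact:
    """Assumed impact scores (0 = none, 3 = severe) if the field were leaked or altered."""
    availability: int
    integrity: int
    confidentiality: int


def grade(impact, cutoff=2):
    """Return level 1 (more sensitive) if any dimension reaches the cutoff, else level 2."""
    worst = max(impact.availability, impact.integrity, impact.confidentiality)
    return 1 if worst >= cutoff else 2


# Hypothetical scores for two fields.
FIELD_IMPACT = {
    "id_number": Impact(availability=1, integrity=2, confidentiality=3),
    "bed_number": Impact(availability=1, integrity=1, confidentiality=1),
}

levels = {field: grade(impact) for field, impact in FIELD_IMPACT.items()}
```

Taking the worst dimension rather than an average is a deliberately conservative choice: one severe dimension is enough to place a field in the more sensitive level.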
103. The name, the hospitalization number and the sickbed number of the patient are marked by brief codes;
This embodiment specifically describes the privacy information definition and labeling module. The patient's name, hospitalization number and bed number are all private information and must be covered before a photograph is uploaded. The patient's name is covered with a black block, and the initials of the name are placed over the covered area in a white font as a brief code. The hospitalization number is retained and placed over the covered area, and the bed number is covered completely.
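The labeling of step 103 can be sketched as below. The sketch assumes the name is already available in romanized form, and it keeps the last four characters of the hospitalization number, following the "four positions" mentioned in claim 4 and the trailing digits used for verification in step 106. The function names and the record layout are hypothetical.

```python
def name_brief_code(romanized_name):
    """Upper-case initials of a romanized name, for example 'Zhang San' gives 'ZS'."""
    return "".join(part[0].upper() for part in romanized_name.split())


def inpatient_tail(inpatient_no, keep=4):
    """Keep only the last `keep` characters of the hospitalization number."""
    return inpatient_no[-keep:]


def label_record(record):
    """Build the values that replace the covered fields on the photographed sheet."""
    return {
        "name": name_brief_code(record["name"]),
        "inpatient_no": inpatient_tail(record["inpatient_no"]),
        "bed_no": None,  # the bed number is covered completely
    }


# Fictional record used only to exercise the sketch.
labeled = label_record(
    {"name": "Zhang San", "inpatient_no": "2023001234", "bed_no": "12-3"}
)
```

Drawing the black block and the white text onto the photograph is an image-editing step that this sketch leaves out.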
104. Constructing a word replacement dictionary, listing candidate values which can be used as replacement values, randomly selecting the replacement values, and reordering according to a fixed rule;
This embodiment specifically describes the text data desensitization module. The module constructs a word replacement dictionary, replaces sensitive information with corresponding words, and converts and rewrites the resulting sentences. All candidate values that can serve as replacement values are listed, and at replacement time one candidate is selected from the list either at random or according to a certain rule. For parameterized substitution, the data to be desensitized is taken as the input of a mapping function and transformed to obtain the desensitized data. Attribute values of numeric, date and time types are transformed and offset according to rules. The patient's identity card number is rearranged position by position according to a fixed rule, so that the data can be restored with the inverse rule. An intermediate mapping table for the desensitization algorithm is created, whose fields consist of the outpatient number, the mapped outpatient number and the mapped name. The mapped outpatient number is obtained through a mapping relation and consists of a fixed letter and a random number, where the random number must not repeat. The mapped inpatient number is obtained through a mapping relation that follows the same rule as the mapped outpatient number. The mapped name is a common name and is not repeated in the mapping table.
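The reversible rearrangement, the date offset and the intermediate mapping table can be sketched as follows. The permutation `PERM`, the seven-day offset, the `M` prefix, the six-digit random part and the list of names are all assumptions chosen for illustration; the description only states that the rule is fixed, that the random number must not repeat, and that mapped names are common names.

```python
import random
from datetime import date, timedelta

PERM = [3, 0, 5, 1, 4, 2]  # hypothetical fixed rule: output position j takes input position PERM[j]


def rearrange(text, perm=PERM):
    """Permute characters block by block; a trailing partial block is left unchanged."""
    n = len(perm)
    full = len(text) - len(text) % n
    blocks = ["".join(text[i:i + n][p] for p in perm) for i in range(0, full, n)]
    return "".join(blocks) + text[full:]


def restore(text, perm=PERM):
    """Undo rearrange() by applying the inverse permutation."""
    inverse = [0] * len(perm)
    for new_pos, old_pos in enumerate(perm):
        inverse[old_pos] = new_pos
    return rearrange(text, inverse)


def shift_date(value, days=7):
    """Offset a date by a fixed number of days; a negative offset reverses it."""
    return value + timedelta(days=days)


class MappingTable:
    """Intermediate table: outpatient number -> (mapped outpatient number, mapped name)."""

    def __init__(self, names, prefix="M", seed=None):
        self._rng = random.Random(seed)
        self._prefix = prefix
        self._names = list(names)
        self._rng.shuffle(self._names)  # each name is handed out at most once
        self._used = set()
        self._rows = {}

    def lookup(self, outpatient_no):
        if outpatient_no not in self._rows:
            if not self._names:
                raise RuntimeError("no unused names left")
            number = self._rng.randrange(10**6)
            while number in self._used:  # enforce the non-repetition principle
                number = self._rng.randrange(10**6)
            self._used.add(number)
            self._rows[outpatient_no] = (f"{self._prefix}{number:06d}", self._names.pop())
        return self._rows[outpatient_no]


id_no = "110101199001011234"  # fictional 18-character identifier
scrambled = rearrange(id_no)
table = MappingTable(["Li Ming", "Wang Fang", "Zhang Wei"], seed=0)
row_a = table.lookup("OP001")
row_b = table.lookup("OP002")
```

Because the same outpatient number always returns the same row, records belonging to one patient stay linked after desensitization, and `restore(scrambled)` recovers the original identifier for anyone who holds the rule.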
105. Desensitizing the image data by an Outgusee algorithm;
This embodiment specifically describes the picture data desensitization module. The module determines the position of the main image, reads information from the images in the database, marks information label names on the images, and screens the labels by label name. It then operates on the sensitive information obtained from the screening, distributes the sensitive information of the patient image around the image, and performs steganographic writing and reading by means of the Outgusee algorithm.
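The description does not detail the Outgusee algorithm, so the sketch below substitutes a generic least-significant-bit (LSB) scheme to show the idea of steganographic writing and reading. It works on a flat list of 8-bit pixel values rather than a real image file and writes into the first values rather than around the image border; the two-byte length header and the payload format are assumptions.

```python
def embed(pixels, message):
    """Hide a UTF-8 message in the lowest bit of successive pixel values."""
    data = message.encode("utf-8")
    payload = len(data).to_bytes(2, "big") + data  # 2-byte length header (assumed format)
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for message")
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # overwrite only the least significant bit
    return out


def extract(pixels):
    """Read back a message written by embed()."""

    def read_bytes(start_bit, count):
        value = bytearray()
        for b in range(count):
            byte = 0
            for i in range(8):
                byte = (byte << 1) | (pixels[start_bit + b * 8 + i] & 1)
            value.append(byte)
        return bytes(value)

    length = int.from_bytes(read_bytes(0, 2), "big")
    return read_bytes(16, length).decode("utf-8")


# A fictional 20 x 20 grey image flattened to 400 values.
cover = [128] * 400
stego = embed(cover, "ZS-1234")
recovered = extract(stego)
```

Each pixel value changes by at most one, so the hidden brief code does not visibly alter the photograph. Plain LSB embedding does not survive lossy recompression such as JPEG, which is one reason a production system would need a more robust scheme.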
106. Carrying out joint verification using the data brief code and the image desensitization to determine the consistency of the laboratory sheet and the patient;
This embodiment specifically describes the data verification module. The module performs a three-way joint verification using the data brief code, the picture steganography and the patient's abbreviated name in the EDC to verify the consistency of a laboratory sheet, and uses the trailing digits of the patient's hospitalization number to verify that different laboratory sheets and medical records are consistent with the patient information.
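The three-way check can be sketched as below, assuming the brief code read from the visible label, the payload recovered from the steganography and the abbreviation stored in the EDC are already available as strings. The `CODE-TAIL` payload format and the dictionary key `inpatient_tail` are hypothetical.

```python
def verify_sheet(visible_code, hidden_payload, edc_abbreviation):
    """True when the visible brief code, the hidden code and the EDC abbreviation agree."""
    hidden_code, _, _tail = hidden_payload.partition("-")
    return visible_code == hidden_code == edc_abbreviation


def same_patient(sheets):
    """True when every sheet carries the same trailing hospitalization digits."""
    return len({sheet["inpatient_tail"] for sheet in sheets}) == 1


# Fictional values used only to exercise the sketch.
sheet_ok = verify_sheet("ZS", "ZS-1234", "ZS")
sheet_mismatch = verify_sheet("ZS", "LM-1234", "ZS")
linked = same_patient([{"inpatient_tail": "1234"}, {"inpatient_tail": "1234"}])
```

Requiring all three sources to agree means a single OCR misreading of the visible code is caught by the other two instead of silently attaching the sheet to the wrong patient.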
Finally: the foregoing describes only preferred embodiments of the invention and is not intended to limit the invention. Any modifications, equivalents and alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (8)

1. A sensitive information desensitization and identification system, characterized in that: the system comprises a patient sensitive information identification module, a patient sensitive information grading module, a privacy information definition and labeling module, a text data desensitization module, an image data desensitization module and a data verification module; the patient sensitive information identification module establishes discovery rules, links the input data to a database, and matches the data in the database against the discovery rules; the patient sensitive information grading module divides the data into two levels according to the sensitivity of the patient information; the privacy information definition and labeling module labels the patient's basic information with brief codes; the text data desensitization module constructs a replacement dictionary to which candidate replacement values are added, and during data desensitization selects a replacement value at random and reorders values according to a fixed rule; the image data desensitization module desensitizes the image data by means of an algorithm; and the data verification module performs joint verification using the data brief codes and the image desensitization, verifying that different laboratory sheets are consistent with the patient information.
2. A sensitive information desensitization and identification system according to claim 1, wherein: during the patient data entry stage, the patient sensitive information identification module links to the entered data source, checks the connectivity of the data source, and obtains local data and metadata; it performs a preliminary identification of user sensitive information on the obtained data and identifies the patient's sensitive information according to the type and content of the data, using a sensitive information identification engine; the identification methods include feature word extraction in the database, rules, and natural language processing; the patient's data fields are stored in an identification library of sensitive fields, and during a sensitive information identification task the long data fields are the focus of identification, so as to improve the efficiency of identifying and desensitizing the patient's sensitive information; based on the data entered for the patient, all statements in the database are analyzed and the sensitive information is identified in the sensitive information database; according to the data characteristics configured in a discovery rule, and combining field types with sample data, the field data in the database is compared and analyzed to obtain its matching degree with the discovery rule; when the matching degree reaches a set threshold, the discovery rule is considered matched, and the more sample data there is, the higher the identification rate of sensitive objects; sampling analysis of the data quickly sorts out metadata with privacy characteristics and discovers sensitive data automatically, the detected data including names, certificate numbers, bank accounts, addresses and telephone numbers; and the user is alerted to the sensitive data and can proceed directly to configuring the desensitization rules through a guided mode.
3. A sensitive information desensitization and identification system according to claim 1, wherein: the patient sensitive information grading module serves to focus protection on the patient's sensitive information during data interaction, and divides the identified patient sensitive information into two levels according to its availability, integrity and confidentiality; the grading takes into account the degree of potential threat and economic loss that leakage of the information would cause the patient; and the sensitivity grading of user sensitive information in the data interaction process is completed on the basis of these key grading factors.
4. A sensitive information desensitization and identification system according to claim 1, wherein: in the privacy information definition and labeling module, the patient's name, hospitalization number and bed number are all private information and must be covered before a photograph is uploaded; the patient's name is covered with a black block, and the initials of the name are placed over the covered area in a white font as a brief code; four digits of the hospitalization number are retained and placed over the covered area; and the bed number is covered completely.
5. A sensitive information desensitization and identification system according to claim 1, wherein: the text data desensitization module constructs a word replacement dictionary, replaces sensitive information with corresponding words, and converts and rewrites the resulting sentences; at replacement time one candidate is selected from the candidate values that can serve as replacement values, either at random or according to a certain rule; for parameterized substitution, the data to be desensitized is taken as the input of a mapping function and transformed to obtain the desensitized data; attribute values of numeric, date and time types are transformed and offset according to rules; the patient's identity card number is rearranged position by position according to a fixed rule, so that the data can be restored with the inverse rule; an intermediate mapping table for the desensitization algorithm is established, whose fields consist of the outpatient number, the mapped outpatient number and the mapped name; the mapped outpatient number is obtained through a mapping relation and consists of a fixed letter and a random number, the random number satisfying the principle of non-repetition; the mapped inpatient number is obtained through a mapping relation that follows the same rule as the mapped outpatient number; and the mapped name is formed by random mapping from a table of common names and is not repeated in the mapping table.
6. A sensitive information desensitization and identification system according to claim 1, wherein: the picture data desensitization module determines the position of the main image, reads information from the images in the database, marks information label names on the images, screens the labels by label name, operates on the sensitive information obtained from the screening, distributes the sensitive information of the patient image around the image, and performs steganographic writing and reading by means of the Outgusee algorithm.
7. A sensitive information desensitization and identification system according to claim 1, wherein: the data verification module performs a three-way joint verification using the data brief code, the picture steganography and the patient's abbreviated name in the EDC to confirm the consistency of a laboratory sheet, and uses the trailing digits of the patient's hospitalization number to verify that different laboratory sheets and medical records are consistent with the patient information.
8. A sensitive information desensitizing and identifying system according to claims 1-7, wherein the specific operation steps are as follows:
S1, linking the input data with a database and matching them against the data characteristics configured in a discovery rule, wherein the higher the matching degree, the higher the identification rate of a sensitive object;
S2, dividing the data information into two different levels according to the availability, integrity, and confidentiality of the patient's sensitive information;
S3, labeling the patient's name, hospitalization number, and sickbed number with a brief code;
S4, constructing a word replacement dictionary, listing the candidate values that can be used as replacement values, randomly selecting a replacement value, and reordering according to a fixed rule;
S5, desensitizing the image data through the OutGuess algorithm;
S6, performing joint verification through the data brief code and image desensitization, and determining the consistency between the laboratory sheet and the patient.
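Step S1 above, matching input data against configured discovery rules, might be sketched as follows. The claims do not specify the rule format or how the matching degree is computed; the regular expressions in `RULES`, the fraction-of-matches score, and the 0.8 threshold are assumptions for illustration.

```python
import re

# Hypothetical discovery rules: rule name -> pattern describing the data feature.
RULES = {
    "id_number": re.compile(r"\d{17}[\dXx]"),
    "phone": re.compile(r"1\d{10}"),
}


def match_degree(values, pattern):
    """Fraction of non-empty values that fully match the pattern."""
    values = [v for v in values if v]
    if not values:
        return 0.0
    hits = sum(1 for v in values if pattern.fullmatch(v))
    return hits / len(values)


def discover(columns, threshold=0.8):
    """Return {column: (rule name, degree)} for columns judged sensitive."""
    found = {}
    for name, values in columns.items():
        # Keep the rule with the highest matching degree for this column.
        best_rule, best = max(
            ((rule, match_degree(values, pattern)) for rule, pattern in RULES.items()),
            key=lambda item: item[1],
        )
        if best >= threshold:
            found[name] = (best_rule, best)
    return found
```

A column is flagged only when its best rule reaches the threshold, which reflects the statement in S1 that a higher matching degree gives a higher identification rate.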
CN202310257477.7A 2023-03-16 2023-03-16 Sensitive information desensitization and recognition system Pending CN116502258A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310257477.7A CN116502258A (en) 2023-03-16 2023-03-16 Sensitive information desensitization and recognition system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310257477.7A CN116502258A (en) 2023-03-16 2023-03-16 Sensitive information desensitization and recognition system

Publications (1)

Publication Number Publication Date
CN116502258A true CN116502258A (en) 2023-07-28

Family

ID=87317388

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310257477.7A Pending CN116502258A (en) 2023-03-16 2023-03-16 Sensitive information desensitization and recognition system

Country Status (1)

Country Link
CN (1) CN116502258A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117349879A (en) * 2023-09-11 2024-01-05 江苏汉康东优信息技术有限公司 Text data anonymization privacy protection method based on continuous word bag model
CN118350050A (en) * 2024-06-12 2024-07-16 山东浪潮科学研究院有限公司 Data desensitizing method, device, electronic equipment, storage medium and computer program

Similar Documents

Publication Publication Date Title
CN116502258A (en) Sensitive information desensitization and recognition system
US20210327000A1 (en) Systems and methods for insurance fraud detection
US20240119177A1 (en) Method and system for automated text anonymisation
Beckwith et al. Development and evaluation of an open source software tool for deidentification of pathology reports
CN103294746B (en) For the method and system for going identification in visual media data
US8468167B2 (en) Automatic data validation and correction
CN105653984B (en) File fingerprint method of calibration and device
US9372916B2 (en) Document template auto discovery
US11120221B2 (en) Method and system to resolve ambiguities in regulations
US20140215301A1 (en) Document template auto discovery
CN112257446A (en) Named entity recognition method and device, computer equipment and readable storage medium
US20040162831A1 (en) Document handling system and method
US20230282322A1 (en) System and method for anonymizing medical records
CN112508145B (en) Electronic seal generation and verification method and device, electronic equipment and storage medium
EP4185984A1 (en) Classifying pharmacovigilance documents using image analysis
Mohammadi et al. Weakly supervised learning and interpretability for endometrial whole slide image diagnosis
CN113221762A (en) Cost balance decision method, insurance claim settlement decision method, device and equipment
CN116052848B (en) Data coding method and system for medical imaging quality control
CN116469526A (en) Training method, device, equipment and storage medium for traditional Chinese medicine diagnosis model
JP4679955B2 (en) Wound and disease name coding method and wound and disease name coding program
Mosquera-Zamudio et al. A Spitzoid Tumor dataset with clinical metadata and Whole Slide Images for Deep Learning models
JP2020140583A (en) Dictionary creation device, dictionary creation method, and dictionary creation program
CN113592523B (en) Financial data processing system and method
Etter et al. Project SEARCH (Scanning EARs for Child Health): validating an ear biometric tool for patient identification in Zambia
CN113947510A (en) Real estate electronic license management system based on file format self-adaptation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination