CN113139005A - Same-person identification method based on same-person identification model and related equipment - Google Patents

Same-person identification method based on same-person identification model and related equipment

Info

Publication number
CN113139005A
Authority
CN
China
Prior art keywords
user
data
same
person
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110433355.XA
Other languages
Chinese (zh)
Inventor
姚海莹
满晏松
贾声声
李苏南
柳恭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kangjian Information Technology Shenzhen Co Ltd
Original Assignee
Kangjian Information Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kangjian Information Technology Shenzhen Co Ltd filed Critical Kangjian Information Technology Shenzhen Co Ltd
Priority to CN202110433355.XA priority Critical patent/CN113139005A/en
Publication of CN113139005A publication Critical patent/CN113139005A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24564 Applying rules; Deductive queries
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the field of artificial intelligence and discloses a same-person identification method based on a same-person identification model, and related equipment. The method comprises: obtaining sample user data in each service system, and performing offline analysis on the sample user data to obtain user attribute data; extracting public attribute data of the users from the user attribute data and analyzing the public attribute data to obtain a same-person identification rule; training a preset recognition tool with the same-person identification rule and the public attribute data as training corpora to obtain a same-person identification model; and inputting the user data held in each service system for a user to be identified into the same-person identification model, and judging whether the users corresponding to the respective user data are the same person. The technical scheme of the invention enables same-person identification of users and patients in the medical industry, facilitates subsequent unified management and synchronization of the health data of the same patient, and improves the integrity and authenticity of users' health data.

Description

Same-person identification method based on same-person identification model and related equipment
Technical Field
The invention relates to the field of artificial intelligence, in particular to a same-person identification method based on a same-person identification model and related equipment.
Background
As informatization of China's medical industry has advanced, many different medical service business systems have been built. Because the user information, data standards, information scope, and so on differ from one business system to another, the systems are heterogeneous, and the same patient may generate multiple pieces of patient information on the same business system. The business behavior information of a single patient is therefore scattered, it cannot be created, deleted, modified, and queried in a unified way, and the user experience is inconsistent.
As medical services develop, the patient information held in their business systems needs to be managed in a unified manner to ensure the integrity and accuracy of the information. How to identify all patient information belonging to the same patient across the business systems without affecting those systems, so as to complete that patient's information and improve the efficiency and accuracy of managing it, has therefore become an urgent problem.
Disclosure of Invention
The main purpose of the invention is to solve the technical problem in the prior art that same-person identification cannot be performed on the user information of multiple users, which results in low efficiency and accuracy when managing the information of the same user.
The invention provides a same-person identification method based on a same-person identification model, which comprises the following steps: acquiring sample user data in each service system; performing off-line analysis on the sample user data to obtain analysis data, and extracting attribute data belonging to a preset attribute category from the analysis data as user attribute data; calculating the similarity of the attribute categories of each attribute data in the user attribute data to obtain a similarity value, and extracting public attribute data from the user attribute data based on the similarity value; comparing public attribute values in the public attribute data to obtain a comparison result, and analyzing the comparison result to obtain a same person identification rule; taking the same-person recognition rule and the public attribute data as training corpora, and training a preset recognition tool to obtain a same-person recognition model; and searching user data corresponding to the user to be identified in different service systems, inputting the user data into the same-person identification model for identification, and judging whether the users corresponding to the user data in the different service systems are the same person or not based on the identification result.
Optionally, in a first implementation manner of the first aspect of the present invention, the performing offline analysis on the sample user data to obtain analysis data, and extracting attribute data belonging to a preset attribute category from the analysis data as user attribute data includes: performing off-line analysis on the sample user data based on a preset data analysis rule to obtain analysis data; extracting attribute feature information of each attribute data from the analyzed data, and calculating semantic similarity between the attribute feature information and a preset attribute category to obtain a first similarity value; and comparing the first similarity value with a preset similarity threshold, and if the first similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the analysis data as user attribute data.
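The threshold-based extraction described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how semantic similarity is computed, so a plain string similarity from Python's difflib stands in for it, and the category names, the threshold value of 0.8, and the function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical preset attribute categories and threshold (not given in the source).
PRESET_CATEGORIES = ["user_id", "patient_id", "name", "gender",
                     "birth_date", "cert_type", "cert_no"]
SIMILARITY_THRESHOLD = 0.8


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for the semantic similarity in the text."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def extract_user_attributes(parsed: dict) -> dict:
    """Keep only the fields whose name is close enough to a preset category."""
    selected = {}
    for field, value in parsed.items():
        best = max(PRESET_CATEGORIES, key=lambda c: similarity(field, c))
        # "Not less than the threshold" means the field is kept.
        if similarity(field, best) >= SIMILARITY_THRESHOLD:
            selected[best] = value
    return selected


# Parsed (analysis) data for one sample user; the last field matches no category.
parsed_record = {"Name": "Zhang San", "birthdate": "1990-01-01",
                 "last_login_ip": "10.0.0.1"}
user_attributes = extract_user_attributes(parsed_record)
```

In practice the similarity function would likely be replaced by an embedding-based or dictionary-based semantic measure; only the comparison against the threshold is taken from the text.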
Optionally, in a second implementation manner of the first aspect of the present invention, the calculating a similarity between attribute categories of each attribute data in the user attribute data to obtain a similarity value, and extracting common attribute data from the user attribute data based on the similarity value includes: in a semantic space, calculating semantic similarity of attribute categories of each attribute data in the user attribute data to obtain a second similarity value; and comparing the second similarity value with a preset similarity threshold, and if the second similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the user attribute data as public attribute data.
Optionally, in a third implementation manner of the first aspect of the present invention, the searching, in the different service systems, for user data corresponding to a user to be identified, inputting the user data into the same-person identification model for identification, and determining, based on an identification result, whether the users corresponding to the user data in the different service systems are the same person includes: acquiring user data of a user to be identified, and extracting a user account in the user data; searching user information data and first patient information data corresponding to the user account from user data, wherein the user information data at least comprises a user identification number, a user certificate number and user basic identity information, and the first patient information data at least comprises a first patient identification number, a first patient certificate number and first patient basic identity information; and calling the same-person identification model, and analyzing the user information data and the patient information data belonging to the same user account to obtain a result of whether the user and the first patient are the same person.
Optionally, in a fourth implementation manner of the first aspect of the present invention, the invoking the same-person identification model, and analyzing the user information data and the patient information data belonging to the same user account to obtain a result of whether the user and the first patient are the same person includes: calling the same-person identification model, and comparing whether the user identification number belonging to the same user account is consistent with the first patient identification number to obtain a first comparison result; comparing whether the user certificate number belonging to the same certificate type under the same user account number is consistent with the first patient certificate number or not to obtain a second comparison result; comparing whether the basic identity information of the user under the same account number is consistent with the basic identity information of the first patient or not to obtain a third comparison result, wherein the basic identity information at least comprises name, gender and date of birth; determining that the user is the same person as the first patient when at least one of the first comparison result, the second comparison result and the third comparison result is consistent.
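The three comparisons and the "at least one is consistent" decision can be sketched as below. The record layout and field names are hypothetical, and treating a missing value as a non-match is an assumption of this sketch rather than something the source states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identity:
    """Hypothetical record layout; field names are illustrative only."""
    id_number: Optional[str] = None   # user or patient identification number
    cert_type: Optional[str] = None
    cert_no: Optional[str] = None
    name: Optional[str] = None
    gender: Optional[str] = None
    birth_date: Optional[str] = None


def _all_present_and_equal(pairs) -> bool:
    """True only if every value is present and equal on both sides (assumption)."""
    return all(a is not None and a == b for a, b in pairs)


def is_same_person(user: Identity, patient: Identity) -> bool:
    # First comparison: identification numbers.
    first = _all_present_and_equal([(user.id_number, patient.id_number)])
    # Second comparison: certificate numbers, only within the same certificate type.
    second = _all_present_and_equal([(user.cert_type, patient.cert_type),
                                     (user.cert_no, patient.cert_no)])
    # Third comparison: basic identity information (name, gender, date of birth).
    third = _all_present_and_equal([(user.name, patient.name),
                                    (user.gender, patient.gender),
                                    (user.birth_date, patient.birth_date)])
    # Same person when at least one comparison result is consistent.
    return first or second or third


user = Identity(id_number="U1", cert_type="ID", cert_no="X123",
                name="Li Lei", gender="M", birth_date="1988-05-02")
same_cert = Identity(id_number="P9", cert_type="ID", cert_no="X123")
other_type = Identity(id_number="P7", cert_type="passport", cert_no="X123")
```

Here `is_same_person(user, same_cert)` is true through the second comparison alone, while `other_type` carries the same certificate number under a different certificate type and so does not match.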
Optionally, in a fifth implementation manner of the first aspect of the present invention, the searching, in the different service systems, for user data corresponding to a user to be identified, inputting the user data into the same-person identification model for identification, and determining, based on an identification result, whether the users corresponding to the user data in the different service systems are the same person includes: extracting second patient information data and third patient information data which belong to the same user account in the user data of the user to be identified; comparing whether the second patient information data is consistent with the third patient information data to obtain a comparison result; and calling the same-person identification model, and analyzing the comparison result to obtain a result of whether the second patient and the third patient are the same person.
Optionally, in a sixth implementation manner of the first aspect of the present invention, after searching for user data corresponding to a user to be identified in different service systems, inputting the user data into the same-person identification model for identification, and determining whether users corresponding to the user data in different service systems are the same person based on an identification result, the method further includes: using the user data subjected to the same-person recognition as a secondary training corpus; and carrying out secondary training on the same-person recognition model based on the secondary training corpus to obtain the same-person recognition model after secondary training.
A second aspect of the present invention provides a same-person identification device, including: an acquisition module, configured to acquire sample user data in each service system; a parsing module, configured to perform off-line analysis on the sample user data to obtain analysis data, and to extract attribute data belonging to a preset attribute category from the analysis data as user attribute data; a calculation module, configured to calculate the similarity of the attribute categories of the attribute data in the user attribute data to obtain a similarity value, and to extract public attribute data from the user attribute data based on the similarity value; a comparison module, configured to compare public attribute values in the public attribute data to obtain a comparison result, and to analyze the comparison result to obtain a same-person identification rule; a training module, configured to train a preset recognition tool with the same-person identification rule and the public attribute data as training corpora to obtain a same-person identification model; and an identification module, configured to search for user data corresponding to the user to be identified, input the user data into the same-person identification model for identification, and judge, based on the identification result, whether the users corresponding to the user data in different service systems are the same person.
Optionally, in a first implementation manner of the second aspect of the present invention, the parsing module is specifically configured to: performing off-line analysis on the sample user data based on a preset data analysis rule to obtain analysis data; extracting attribute feature information of each attribute data from the analyzed data, and calculating semantic similarity between the attribute feature information and a preset attribute category to obtain a first similarity value; and comparing the first similarity value with a preset similarity threshold, and if the first similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the analysis data as user attribute data.
Optionally, in a second implementation manner of the second aspect of the present invention, the calculation module is specifically configured to: in a semantic space, calculating semantic similarity of attribute categories of each attribute data in the user attribute data to obtain a second similarity value; and comparing the second similarity value with a preset similarity threshold, and if the second similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the user attribute data as public attribute data.
Optionally, in a third implementation manner of the second aspect of the present invention, the identification module includes: an extraction unit, configured to acquire user data of a user to be identified and extract a user account from the user data; a searching unit, configured to search the user data for user information data and first patient information data corresponding to the user account, wherein the user information data at least comprises a user identification number, a user certificate number and user basic identity information, and the first patient information data at least comprises a first patient identification number, a first patient certificate number and first patient basic identity information; and an analysis unit, configured to call the same-person identification model and analyze the user information data and the patient information data belonging to the same user account to obtain a result of whether the user and the first patient are the same person.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the analysis unit is specifically configured to: calling the same-person identification model, and comparing whether the user identification number belonging to the same user account is consistent with the first patient identification number to obtain a first comparison result; comparing whether the user certificate number belonging to the same certificate type under the same user account number is consistent with the first patient certificate number or not to obtain a second comparison result; comparing whether the basic identity information of the user under the same account number is consistent with the basic identity information of the first patient or not to obtain a third comparison result, wherein the basic identity information at least comprises name, gender and date of birth; determining that the user is the same person as the first patient when at least one of the first comparison result, the second comparison result and the third comparison result is consistent.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the identification module is specifically configured to: extracting second patient information data and third patient information data which belong to the same user account in the user data of the user to be identified; comparing whether the second patient information data is consistent with the third patient information data to obtain a comparison result; and calling the same-person identification model, and analyzing the comparison result to obtain a result of whether the second patient and the third patient are the same person.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the same-person recognition apparatus further includes a secondary training module, where the secondary training module is specifically configured to: using the user data subjected to the same-person recognition as a secondary training corpus; and carrying out secondary training on the same-person recognition model based on the secondary training corpus to obtain the same-person recognition model after secondary training.
A third aspect of the present invention provides same-person identification equipment comprising: a memory having instructions stored therein and at least one processor, the memory and the at least one processor being interconnected by a line; the at least one processor invokes the instructions in the memory to cause the same-person identification equipment to perform the steps of the same-person identification method based on a same-person identification model described above.
A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon instructions which, when run on a computer, cause the computer to perform the steps of the above-mentioned same-person identification method based on a same-person identification model.
In the technical scheme provided by the invention, sample user data of each service system is obtained, public attribute data in the sample user data is extracted and analyzed to obtain a same-person identification rule, the same-person identification rule and the public attribute data are then used as training corpora to train a same-person identification model, the user data in each service system corresponding to a user to be identified is input into the same-person identification model for identification, and whether the users corresponding to the respective user data are the same person is judged. By constructing a same-person identification model to identify patients in the medical field, the technical scheme improves the efficiency of patient information management and the accuracy of the information.
Drawings
FIG. 1 is a schematic diagram of a first embodiment of the same-person identification method based on a same-person identification model according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a second embodiment of the same-person identification method based on a same-person identification model according to an embodiment of the invention;
FIG. 3 is a schematic diagram of a third embodiment of the same-person identification method based on a same-person identification model according to an embodiment of the invention;
FIG. 4 is a schematic diagram of a fourth embodiment of the same-person identification method based on a same-person identification model according to an embodiment of the invention;
FIG. 5 is a schematic diagram of an embodiment of the same-person identification device according to an embodiment of the invention;
FIG. 6 is a schematic diagram of another embodiment of the same-person identification device according to an embodiment of the invention;
FIG. 7 is a schematic diagram of an embodiment of the same-person identification device according to an embodiment of the invention.
Detailed Description
The embodiment of the invention provides a same-person identification method based on a same-person identification model, and related equipment. Sample user data in each service system is obtained and analyzed offline to obtain user attribute data; public attribute data of the users is extracted from the user attribute data and analyzed to obtain a same-person identification rule; a preset recognition tool is trained with the same-person identification rule and the public attribute data as training corpora to obtain a same-person identification model; and the user data held in each service system for a user to be identified is input into the same-person identification model, which judges whether the users corresponding to the respective user data are the same person. The technical scheme of the embodiment enables same-person identification of users and patients in the medical industry, facilitates subsequent unified management and synchronization of the health data of the same patient, and improves the integrity and authenticity of users' health data.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
To aid understanding, the specific flow of an embodiment of the present invention is described below. Referring to FIG. 1, a first embodiment of the same-person identification method based on a same-person identification model according to an embodiment of the present invention includes:
101, obtaining sample user data in each service system;
As business in the medical industry has developed, a considerable number of business lines have built their own business systems. These systems are designed by different product managers and implemented by different technical teams. Because the user journey, data standards, information scope, and so on differ from one business to another, the resulting business systems are heterogeneous, and the same patient may generate multiple patient identification numbers on the same client. The business behavior information of a single patient is scattered, it cannot be created, deleted, modified, and queried in a unified way, and the user experience is inconsistent. Patient information may be cross-referenced between services, completely isolated, or only partially cross-referenced; the patient information standards and content scope of the service systems also differ, and information completeness is lacking to varying degrees. In addition, problems common to internet users are present: most information is entered by the users themselves without real-name verification, so its credibility is unknown.
To facilitate tracking and management of the health information of the same patient, the service systems corresponding to different service scenarios are gathered, and the relevant information in each service system that the user has authorized and that is not private user data is integrated as the sample user data. Specifically, the service systems may include an online inquiry service system, an external doctor inquiry service system, a physical examination service system, a registration service system, an electronic prescription service system, an enterprise user system, a private doctor service system, and the like. In addition, a user obtains a unified user identification number after registering in the service systems and can use it not only for the user's own needs but also to arrange consultations, book physical examinations, or obtain other services for family members or friends, so the sample user data may include information about multiple persons.
102, performing off-line analysis on the sample user data to obtain analysis data, and extracting attribute data belonging to a preset attribute category from the analysis data as user attribute data;
Attribute categories of user data are preset. The sample user data is then analyzed offline: using BI big data technology, a data analysis tool is called to perform offline data analysis on the sample user data corresponding to the sample users in each service system, the sample user data is classified according to the data analysis results and the preset attribute categories, and the valid attribute data is extracted from the sample user data as the user attribute data.
Specifically, the attribute data of the sample users that corresponds to the preset attribute categories is extracted from the obtained analysis data to obtain the user attribute data. The preset attribute categories are formed by selecting, from the obtained sample user data, the information that can be used to verify whether sample user data belongs to the same person, and classifying that information. Attribute data belonging to these attribute categories is then selected from the sample user data to obtain the user attribute data. The user attribute data covers multiple sample users, and the integrity and validity of each sample user's data cannot be guaranteed, so the attribute categories present for different sample users may not be fully consistent: one sample user may have attribute data in a given attribute category while another sample user has none in that category.
103, calculating the similarity of the attribute categories of each attribute data in the user attribute data to obtain a similarity value, and extracting public attribute data from the user attribute data based on the similarity value;
Attribute data that reflects the public attributes of the same person is extracted from the obtained user attribute data, that is, user attribute data of the same attribute category is extracted from the user attribute data of all sample users. The public attributes currently extracted are mainly: user identification number, patient identification number, name, gender, date of birth, certificate type, certificate number, and the like.
Specifically, the user attribute data of the sample users is compared to determine whether the attribute categories corresponding to the user attribute data are consistent. The similarity between the attribute categories corresponding to the user attribute data of all sample users can be calculated, and the similarity value is used as the basis for judging whether two attribute categories are the same. When the similarity value is not less than the preset similarity threshold, the attribute categories corresponding to the attribute data are considered consistent.
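One way to picture this step is to keep only the attribute categories that have a sufficiently similar counterpart in every service system. The sketch below is illustrative only: difflib string similarity stands in for the unspecified similarity measure, and the system names, category names, and the threshold of 0.8 are hypothetical.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.8  # hypothetical value; the source gives no number


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1], used here in place of a semantic measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def common_categories(systems: dict[str, list[str]]) -> list[str]:
    """Categories of the first system that have a close match in every other system."""
    names = list(systems)
    reference = systems[names[0]]
    others = [systems[n] for n in names[1:]]
    return [
        cat for cat in reference
        if all(
            any(similarity(cat, candidate) >= SIMILARITY_THRESHOLD
                for candidate in other)
            for other in others
        )
    ]


# Attribute categories found in the user attribute data of two service systems.
systems = {
    "online_inquiry": ["name", "gender", "birth_date", "cert_no", "nickname"],
    "physical_exam": ["Name", "Gender", "birthdate", "cert_no", "exam_package"],
}
common = common_categories(systems)
```

With these inputs, `nickname` and `exam_package` are dropped because neither has a close enough counterpart in the other system.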
104, comparing public attribute values in the public attribute data to obtain a comparison result, and analyzing the comparison result to obtain a same person identification rule;
The extracted public attribute data is analyzed, and the same-person identification rule is summarized from the analysis result: the public attribute values in the public attribute data of each sample user are compared and analyzed to judge whether the sample users are the same person, and the judgment process is summarized as the same-person identification rule. Specifically, business modeling is performed on the public attributes of the sample users, and public attribute values such as the user identification number, patient identification number, name, gender, date of birth, certificate type, and certificate number are compared for consistency. The comparison is carried out under the user account of the same sample user: the user identification number is compared with the patient identification number; the user's name, gender, and date of birth are compared with the patient's name, gender, and date of birth; and the user's certificate type and certificate number are compared with the patient's certificate type and certificate number. From these three comparisons, the analysis yields the same-person identification rule: two records are regarded as the same person when their identification numbers are consistent, when their certificate types and certificate numbers are consistent, or when their names, genders, and dates of birth are all consistent.
105, taking the same-person recognition rule and the public attribute data as training corpora, and training a preset recognition tool to obtain a same-person recognition model;
The obtained same-person recognition rule and the public attribute data are used as training corpora to train a preset recognition tool, yielding the same-person recognition model. In the training process, the public attribute data of all sample users are input as parameters, pairwise recognition is carried out in the recognition tool according to the same-person recognition rule, and the recognition parameters of the recognition tool are trained to obtain the final same-person recognition model.
106, searching user data corresponding to the user to be identified in different business systems, inputting the user data into the same-person identification model for identification, and judging whether the users corresponding to the user data in the different business systems are the same person or not based on the identification result.
User data corresponding to the user to be identified is searched for in the different business systems, the generated same-person identification model is called, and the user data is input into the model to carry out same-person identification on the user to be identified. In this process, the public attribute data in the user data of the user to be identified is first extracted using the same method as for extracting the public attribute data of the sample users; the public attribute data is then input into the same-person identification model, which performs pairwise identification on it to obtain an identification result. Alternatively, all the extracted public attribute data can be input into the same-person identification model at once, and the model combines the data in pairs to carry out same-person identification. The user data of the user to be identified can also be input into the same-person identification model directly, in which case the model parses the user data and then extracts the public attribute data for same-person identification.
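The pairwise combination described above can be sketched as follows. This is a hedged illustration: the record layout, the `system` key, and the `same_certificate` predicate are hypothetical stand-ins for the trained same-person identification model, which the text does not specify in code.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def identify_pairwise(
    records: List[Record],
    is_same: Callable[[Record, Record], bool],
) -> List[Tuple[str, str, bool]]:
    """Compare every pair of records once and report the decision per pair."""
    return [
        (a["system"], b["system"], is_same(a, b))
        for a, b in combinations(records, 2)
    ]


def same_certificate(a: Record, b: Record) -> bool:
    # Stand-in predicate (assumption): same certificate type and number.
    key_a = (a.get("cert_type"), a.get("cert_number"))
    key_b = (b.get("cert_type"), b.get("cert_number"))
    return None not in key_a and key_a == key_b
```

Any predicate with the same signature can be passed in, so the pairing logic stays independent of how the same-person decision itself is made.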
In addition, each person identified by the same-person identification model can automatically be given an independent and unique health identification number. Without interfering with any business system, a patient identity manager can be built independently on top of the same-person identification model to manage the health identification number of each patient in a unified way. All health record information generated in different business scenarios can then be queried and maintained uniformly using the health identification number, and cross-referenced between different business systems, which gives the scheme good practical value. The trained same-person identification model can be separated from the business, so that the patient identities of different business systems are identified, merged, and cross-referenced by those systems. The patient identity manager engine is implemented through the same-person identification model and maintains a uniform user health identification number. Different business systems call the engine online in real time and cross-reference each other through the users' health identification numbers; the information generated by each patient in the different business systems is supplemented, integrated and completed through the patient identity manager engine, or is queried and referenced directly. Based on the same-person identification model, an online health record that comprehensively integrates a patient's personal medical information can be formed, returning personal health data to the individual and forming a person-centered health record.
Once the identity information sources of the business systems generate new information, the same-person identification model receives and registers the information in a unified manner. When the patient information of a business system changes, the same-person identification model can notify the other business systems. The same-person identification model also converts between the patient identities of all the business systems, so that each business system only needs to maintain its own patient identification numbers in order to communicate with the other business systems about patient identities.
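One possible shape for such a patient identity manager is sketched below. It is an assumption-laden illustration, not the engine described in the patent: the class name `PatientIdentityRegistry`, the `H000001` number format, and the union-find bookkeeping are all choices made for this example. Each business system keeps its own patient identification number, and the registry maps linked numbers to one health identification number and translates between systems.

```python
import itertools
from typing import Dict, Optional, Tuple

LocalId = Tuple[str, str]  # (business system name, patient ID in that system)


class PatientIdentityRegistry:
    """Groups local patient IDs judged to be the same person (union-find)."""

    def __init__(self) -> None:
        self._parent: Dict[LocalId, LocalId] = {}
        self._health_ids: Dict[LocalId, str] = {}
        self._counter = itertools.count(1)

    def register(self, local_id: LocalId) -> None:
        self._parent.setdefault(local_id, local_id)

    def _find(self, local_id: LocalId) -> LocalId:
        while self._parent[local_id] != local_id:
            local_id = self._parent[local_id]
        return local_id

    def link(self, a: LocalId, b: LocalId) -> None:
        """Record that two local IDs were identified as the same person."""
        self.register(a)
        self.register(b)
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a
            # Keep an already issued health ID if only the merged group had one.
            issued = self._health_ids.pop(root_b, None)
            if issued is not None:
                self._health_ids.setdefault(root_a, issued)

    def health_id(self, local_id: LocalId) -> str:
        """Return the unique health identification number of the person."""
        self.register(local_id)
        root = self._find(local_id)
        if root not in self._health_ids:
            self._health_ids[root] = f"H{next(self._counter):06d}"
        return self._health_ids[root]

    def translate(self, local_id: LocalId, target_system: str) -> Optional[str]:
        """Convert a patient ID of one system into the ID used by another."""
        self.register(local_id)
        root = self._find(local_id)
        for system, patient_id in self._parent:
            if system == target_system and self._find((system, patient_id)) == root:
                return patient_id
        return None
```

A usage example: after `reg.link(("inquiry", "P1"), ("physical_exam", "E7"))`, both local IDs resolve to the same health identification number, and `reg.translate(("inquiry", "P1"), "physical_exam")` returns `"E7"`.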
In the embodiment of the invention, sample user data and the same-person identification rule are acquired as training corpora, and a same-person identification model is constructed to carry out same-person identification on the user to be identified. The technical scheme of this embodiment realizes same-person identification of users and patients in the medical industry, facilitates the subsequent unified management and synchronization of the health data of the same patient, and improves the completeness and authenticity of the users' health data.
Referring to fig. 2, a second embodiment of the same-person identification method based on the same-person identification model according to the embodiment of the present invention includes:
201, obtaining sample user data in each service system;
With the development of business in the medical industry, a considerable number of businesses have built their own business systems. These business systems are designed by different product managers and implemented by different technical teams. Because the user journey, data standards, information scope and so on differ from business to business, the resulting business systems are heterogeneous, and the same patient may end up with multiple patient identification numbers on the same client. The information on the different business activities of the same patient is scattered, create, read, update and delete operations cannot be managed uniformly, and the user experience is inconsistent. Patient information is fully cross-referenced between some services, completely isolated between others, and only partially cross-referenced between the rest. Meanwhile, the patient information standards and content scope of the business systems differ, and information completeness is lacking to varying degrees. In addition, there are the problems common to internet users: most information is entered by the users themselves without real-name verification, so its credibility is unknown.
In this case, in order to facilitate tracking and management of the health information of the same patient, the business systems corresponding to different business scenarios need to be collected, and the related information data in each business system that has been authorized by the user and is not user privacy data is integrated as sample user data. Specifically, the business systems may include an online inquiry service system, an external doctor inquiry service system, a physical examination service system, a registration service system, an electronic prescription service system, an enterprise user system, a private doctor service system, and the like. In addition, after registering in a business system the user obtains a unified user identification number, which can be used not only for the user's own consultations but also to make inquiries, book physical examinations or use other services on behalf of family members or friends. The sample user data may therefore include the information data of multiple persons.
202, performing offline analysis on sample user data based on a preset data analysis rule to obtain analysis data;
The sample user data is analyzed offline based on a preset data analysis rule; that is, the historical data related to users and patients in each business system is analyzed offline using BI big data technology. BI stands for business intelligence, and refers to the use of modern data warehouse, online analytical processing (OLAP), data mining and data presentation technologies to analyze data and realize business value.
The key to business intelligence is to extract useful data from the many data sources of different enterprise operating systems and clean it to ensure its correctness, then merge it into an enterprise-level data warehouse through extraction, transformation and loading (the ETL process) to obtain a global view of the enterprise data. On this basis, the data is analyzed and processed with suitable query and analysis tools, data mining tools (big data magic mirror), OLAP tools and the like, at which point the information becomes knowledge that assists decision making. Finally, the knowledge is presented to managers to support their decision-making process.
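The extraction, transformation and loading steps can be sketched in a few lines. Everything specific here is assumed for illustration: the two source systems, their field names, the gender coding, and the use of an in-memory SQLite table as a stand-in for the enterprise data warehouse.

```python
import sqlite3
from typing import Dict, List

# Hypothetical raw exports from two business systems with different field names.
INQUIRY_ROWS = [{"patient_name": " Li Lei ", "sex": "M", "dob": "1990-01-01"}]
EXAM_ROWS = [{"name": "Han Meimei", "gender": "female", "birthday": "1992/05/06"}]

# Transformation rules (assumed): map source fields onto one unified schema.
FIELD_MAP = {
    "inquiry": {"patient_name": "name", "sex": "gender", "dob": "birth_date"},
    "physical_exam": {"name": "name", "gender": "gender", "birthday": "birth_date"},
}
GENDER_MAP = {"m": "M", "male": "M", "f": "F", "female": "F"}


def transform(system: str, row: Dict[str, str]) -> Dict[str, str]:
    """Rename fields, trim whitespace, and normalise gender and date formats."""
    out = {FIELD_MAP[system][key]: value.strip() for key, value in row.items()}
    out["gender"] = GENDER_MAP.get(out["gender"].lower(), "U")
    out["birth_date"] = out["birth_date"].replace("/", "-")
    out["source"] = system
    return out


def run_etl(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS person "
        "(source TEXT, name TEXT, gender TEXT, birth_date TEXT)"
    )
    # Extract
    extracted = [("inquiry", r) for r in INQUIRY_ROWS]
    extracted += [("physical_exam", r) for r in EXAM_ROWS]
    # Transform
    rows: List[Dict[str, str]] = [transform(s, r) for s, r in extracted]
    # Load
    conn.executemany(
        "INSERT INTO person VALUES (:source, :name, :gender, :birth_date)", rows
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
run_etl(conn)
warehouse = conn.execute(
    "SELECT source, name, gender, birth_date FROM person ORDER BY source"
).fetchall()
```

After loading, records from both systems share one schema, which is what later attribute extraction and comparison rely on.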
When the sample user data is analyzed offline through the BI technology, an analysis tool is called to perform the analysis. This step places no restriction on the analysis tool, provided that analysis data can be obtained after the offline analysis.
203, extracting attribute feature information of each attribute data from the analyzed data, and calculating semantic similarity between the attribute feature information and a preset attribute category to obtain a first similarity value;
After the sample user data is analyzed, the resulting analysis data contains the attribute data of all sample users. Each piece of attribute data carries corresponding attribute feature information, from which the attribute category of the attribute data can be identified.
Specifically, the attribute feature information carried by each attribute data is extracted from the analysis data, and semantic similarity comparison is performed on the attribute feature information and the preset attribute category in a semantic space, that is, the semantic similarity between the attribute feature information and the preset attribute category is calculated. In the calculation process, firstly, a preset semantic recognition tool is required to be called to carry out semantic recognition on attribute feature information and preset attribute categories, then, according to the existing semantic similarity calculation method, semantic similarity calculation is carried out on the semantics of the recognized attribute feature information and the semantics of the preset attribute categories, and the calculation result is used as a first similarity value. In this step, the semantic similarity calculation belongs to the prior art, and is not described herein.
204, comparing the first similarity value with a preset similarity threshold, and if the first similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the analysis data as user attribute data;
After the first similarity value is obtained through the semantic similarity calculation, it is compared numerically with a preset similarity threshold. When the first similarity value is not less than the preset similarity threshold, the attribute feature information of the attribute data is semantically consistent with the attribute category, that is, the attribute data belongs to that attribute category. When the first similarity value is less than the preset similarity threshold, the attribute data does not match the attribute category. In this way the attribute category of each piece of attribute data is determined. The attribute feature information of all attribute data is extracted, the semantic similarity between each piece of attribute feature information and each preset attribute category is calculated, the calculated similarity value is compared with the preset similarity threshold to determine the attribute category of each piece of attribute data, and every piece of attribute data that belongs to one of the preset attribute categories is extracted from the analysis data as user attribute data.
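The threshold test can be sketched as follows. The text leaves the semantic similarity method to the prior art, so this example substitutes a plain string-similarity ratio from Python's `difflib` as a stand-in; the category list, the normalisation step and the 0.8 threshold are likewise assumptions for illustration.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, Optional

# Preset attribute categories (taken from the attributes named in the text).
PRESET_CATEGORIES = [
    "identification number", "name", "gender",
    "birth date", "certificate type", "certificate number",
]


def _normalise(text: str) -> str:
    return re.sub(r"[_\-\s]+", " ", text).strip().lower()


def first_similarity(feature: str, category: str) -> float:
    # Stand-in for a semantic similarity measure (assumption): character overlap.
    return SequenceMatcher(None, _normalise(feature), _normalise(category)).ratio()


def classify(feature: str, threshold: float = 0.8) -> Optional[str]:
    """Return the best-matching category if it reaches the threshold, else None."""
    best = max(PRESET_CATEGORIES, key=lambda c: first_similarity(feature, c))
    return best if first_similarity(feature, best) >= threshold else None


def extract_user_attributes(parsed: Dict[str, str], threshold: float = 0.8) -> Dict[str, str]:
    """Keep only the attribute data that falls into a preset category."""
    user_attributes = {}
    for feature, value in parsed.items():
        category = classify(feature, threshold)
        if category is not None:
            user_attributes[category] = value
    return user_attributes
```

A real deployment would replace `first_similarity` with an embedding-based or dictionary-based semantic measure; the threshold comparison around it would stay the same.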
205, in the semantic space, calculating semantic similarity of attribute categories of each attribute data in the user attribute data to obtain a second similarity value;
The obtained user attribute data are compared to determine whether any of them are the same, that is, whether the attribute categories corresponding to the user attribute data are consistent. Specifically, the similarity between the attribute categories corresponding to the user attribute data of all sample users can be calculated, and the similarity value is used as the basis for judging whether the attribute categories are the same.
Furthermore, in the semantic space, semantic similarity analysis is carried out on the attribute categories of each attribute data in the user attribute data, a preset semantic recognition tool is called to carry out semantic recognition on each attribute category, then semantic similarity calculation is carried out on the semantics of the attribute category corresponding to each identified attribute data according to the existing semantic similarity calculation method, and the calculation result is used as a second similarity value.
206, comparing the second similarity value with a preset similarity threshold, and if the second similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the user attribute data as public attribute data;
If the same user attribute data exists, that is, the user attribute data of multiple sample users belong to the same attribute category, that user attribute data is used as the public attribute data.
The second similarity value is compared numerically with the preset similarity threshold. When the second similarity value is not less than the preset similarity threshold, the attribute categories of the corresponding attribute data are consistent, and the attribute data is used as public attribute data, which represents the public attribute information of the sample users. When the second similarity value is less than the preset similarity threshold, the attribute categories of the compared attribute data are inconsistent, that is, the attribute data do not belong to the same attribute category.
Further, if the user attribute data of all the sample users has attribute data belonging to the identification number attribute category, the identification number is taken as common attribute data. The public attribute data can be limited to attribute data such as identification numbers, names, sexes, birth dates, certificate types, certificate numbers and the like.
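A minimal sketch of the public attribute extraction follows. As before, `difflib` string similarity stands in for the unspecified semantic similarity, and the data layout (one dictionary of category label to value per sample user) and the 0.8 threshold are assumptions. A category is kept only if every sample user has a category label that reaches the threshold.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def second_similarity(category_a: str, category_b: str) -> float:
    # Stand-in for semantic similarity between two category labels (assumption).
    return SequenceMatcher(None, category_a.lower(), category_b.lower()).ratio()


def _matching_label(category: str, attributes: Dict[str, str],
                    threshold: float) -> Optional[str]:
    best = max(attributes, key=lambda label: second_similarity(category, label),
               default=None)
    if best is not None and second_similarity(category, best) >= threshold:
        return best
    return None


def extract_common_attributes(users: List[Dict[str, str]],
                              threshold: float = 0.8) -> List[Dict[str, str]]:
    """Return, per user, only the attributes whose category every user shares."""
    if not users:
        return []
    reference = users[0]
    rows: List[Dict[str, str]] = [{} for _ in users]
    for category in reference:
        labels = [_matching_label(category, u, threshold) for u in users]
        if any(label is None for label in labels):
            continue  # not shared by all sample users
        for row, user, label in zip(rows, users, labels):
            row[category] = user[label]
    return rows
```

The output uses the first user's category labels as the unified names, so later comparison steps see the same keys for every sample user.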
207, comparing the public attribute values in the public attribute data to obtain a comparison result, and analyzing the comparison result to obtain the identity recognition rule;
The extracted public attribute data is analyzed, and the same-person identification rule is summarized from the analysis result. That is, the public attribute values in the public attribute data of each sample user are compared and analyzed to judge whether the sample users are the same person, and the judgment process is summarized as the same-person identification rule. Specifically, business modeling is performed using the public attributes of the sample users, and public attribute values such as the user identification number, patient identification number, name, gender, birth date, certificate type and certificate number are compared for consistency. Under the user account of the same sample user, the comparison process is as follows: the user identification number is compared with the patient identification number; the user name, user gender and user birth date are compared with the patient name, patient gender and patient birth date respectively; and the certificate type and certificate number of the user are compared with the certificate type and certificate number of the patient. From these three comparisons, the analysis yields the same-person identification rule: a user and a patient are the same person if their identification numbers are identical, if their certificate type and certificate number are identical, or if their name, gender and birth date are all identical.
208, taking the same-person recognition rule and the public attribute data as training corpora, and training a preset recognition tool to obtain a same-person recognition model;
The obtained same-person recognition rule and the public attribute data are used as training corpora to train a preset recognition tool, yielding the same-person recognition model. In the training process, the public attribute data of all sample users are input as parameters, pairwise recognition is carried out in the recognition tool according to the same-person recognition rule, and the recognition parameters of the recognition tool are trained to obtain the final same-person recognition model.
209, acquiring user data of a user to be identified, and extracting a user account in the user data;
For the user to be identified, the corresponding user data that the user has authorized for use is acquired from the medical business systems, and the user account is extracted from the user data. The user data includes at least the user account, the user information data corresponding to the user account, and patient information data. One piece of user data contains exactly one user account and one piece of user information data, and at least one piece of patient information data; that is, the information data of several patients may exist under the same user account.
210, searching user information data and first patient information data corresponding to a user account from the user data;
The user information data and the patient information data under the same user account are extracted from all the obtained user accounts, and the patient information data is taken as the first patient information data. The patient information data extracted under the user account may be limited to the information data of one patient or may cover the information data of several patients. The extracted user information data includes at least a user identification number, a user certificate number and user basic identity information, and the first patient information data includes at least a first patient identification number, a first patient certificate number and first patient basic identity information.
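The structure of the user data described in steps 209 and 210 can be modelled as below. The class and field names are hypothetical; the constraints (one account, one user information record, at least one patient record) come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class IdentityInfo:
    """Identification number, certificate data and basic identity information."""
    id_number: str
    cert_type: str
    cert_number: str
    name: str
    gender: str
    birth_date: str  # ISO format, e.g. "1990-01-01" (assumed convention)


@dataclass
class UserData:
    account: str
    user_info: IdentityInfo
    patients: List[IdentityInfo]

    def __post_init__(self) -> None:
        # The text requires at least one patient record per user account.
        if not self.patients:
            raise ValueError("user data must contain at least one patient record")


def lookup(all_user_data: List[UserData], account: str) -> Tuple[IdentityInfo, IdentityInfo]:
    """Return the user information and the first patient record of an account."""
    for user_data in all_user_data:
        if user_data.account == account:
            return user_data.user_info, user_data.patients[0]
    raise KeyError(account)
```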
211, calling a same-person identification model, and analyzing the user information data and the patient information data belonging to the same user account to obtain a result of whether the user and the first patient are the same person.
The constructed same-person identification model is called, and the user information data and patient information data belonging to the same user account are input into it. The model compares and analyzes each item in the user information data and the patient information data, judges whether the user and the patient are the same person, and outputs the identification result. The patient information data input into the model may be the information data of one patient or of several patients. When the information data of several patients is input, the model randomly selects one piece of patient information data at a time and compares it with the user information data, until every piece of patient information data has been compared with the user information data, and then outputs the same-person identification result according to the analysis.
First, the user identification number and the first patient identification number are extracted from the user information data and the first patient information data respectively, and the two identification numbers belonging to the same user account are compared for consistency. During this comparison, the component structures of the identification numbers are compared according to the identification number generation rules of the business system, and the contents of each component are then compared, to obtain a first comparison result.
Second, the user certificate number and the first patient certificate number of the same certificate type under the same user account are compared. Specifically, the user certificate number and the first patient certificate number are extracted from the user information data and the first patient information data respectively, and the two certificate numbers of the same certificate type are compared for consistency. As with the identification numbers, the component structures of the certificate numbers are compared according to the certificate number generation rules of the certificate type, and the contents of each component are then compared, to obtain a second comparison result. The certificate type can be limited to the identity card, in which case the certificate number is the identity card number.
Third, the basic identity information of the user and of the first patient under the same account is compared, where the basic identity information includes at least name, gender and date of birth. Specifically, the basic identity information of the user and of the first patient is extracted from the user information data and the first patient information data respectively, and compared item by item: the name of the user with the name of the patient, the gender of the user with the gender of the patient, and the birth date of the user with the birth date of the patient. The three item-level results are combined and analyzed to obtain the comparison result for the basic identity information, which is the third comparison result.
The comparison results obtained in the three comparison steps are combined and analyzed to obtain the identification result of the same-person identification model. If any one of the first comparison result, the second comparison result and the third comparison result is consistent, the same-person identification model determines that the user under the user account and the patient are the same person.
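For the identity card case, the component-by-component comparison can be sketched as follows. The 18-character layout used here (6-digit region code, 8-digit birth date, 3-digit sequence code, 1 check character) is the standard structure of the resident identity card number; the function names are hypothetical, the sample numbers in the usage note are synthetic, and the sketch deliberately does not validate the check character.

```python
import re
from typing import Dict, Optional

# Structure of an 18-character resident identity card number.
ID_CARD_PATTERN = re.compile(
    r"^(?P<region>\d{6})(?P<birth>\d{8})(?P<sequence>\d{3})(?P<check>[\dX])$"
)


def split_components(number: str) -> Optional[Dict[str, str]]:
    """Split a certificate number into its components, or None if malformed."""
    match = ID_CARD_PATTERN.match(number.strip().upper())
    return match.groupdict() if match else None


def compare_certificates(type_a: str, number_a: str,
                         type_b: str, number_b: str) -> bool:
    """Compare certificate numbers of the same type, component by component."""
    if type_a != type_b:
        return False  # only numbers of the same certificate type are compared
    parts_a, parts_b = split_components(number_a), split_components(number_b)
    if parts_a is None or parts_b is None:
        return False  # structure does not follow the generation rule
    return all(parts_a[key] == parts_b[key] for key in parts_a)
```

For example, `compare_certificates("id_card", "11010119900101123x", "id_card", "11010119900101123X")` is consistent because the check character is normalised to upper case before the components are compared.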
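Putting the three comparisons together for one user account might look like the sketch below. It assumes that any single consistent comparison result is sufficient, in line with the rule summarized from the three comparisons, and that missing values never match; the dictionary keys and class name are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Comparison(NamedTuple):
    first: bool   # identification numbers consistent
    second: bool  # certificate type and certificate number consistent
    third: bool   # name, gender and birth date all consistent

    @property
    def same_person(self) -> bool:
        # Assumption: one consistent comparison result is enough.
        return self.first or self.second or self.third


def compare(user: Dict[str, str], patient: Dict[str, str]) -> Comparison:
    def consistent(*keys: str) -> bool:
        return all(
            user.get(k) is not None and user.get(k) == patient.get(k) for k in keys
        )

    return Comparison(
        first=consistent("id_number"),
        second=consistent("cert_type", "cert_number"),
        third=consistent("name", "gender", "birth_date"),
    )


def identify_account(user: Dict[str, str], patients: List[Dict[str, str]]) -> List[bool]:
    """Compare the user with every patient under the account, one at a time."""
    return [compare(user, patient).same_person for patient in patients]
```

Keeping the three results separate in `Comparison` preserves which comparison led to the decision, which is useful when the identified data is reviewed later.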
In the embodiment of the invention, the created same-person identification model carries out same-person identification according to the user information data and patient information data belonging to the same user account. This improves the accuracy of same-person identification, facilitates the subsequent management of the information data of a user and a patient who are the same person under the same account, and improves management efficiency.
Referring to fig. 3, a third embodiment of the method for identifying a same person based on a same person identification model according to the embodiment of the present invention includes:
301, obtaining sample user data in each service system;
302, performing offline analysis on sample user data based on a preset data analysis rule to obtain analysis data;
303, extracting attribute feature information of each attribute data from the analyzed data, and calculating semantic similarity between the attribute feature information and a preset attribute category to obtain a first similarity value;
304, comparing the first similarity value with a preset similarity threshold, and if the first similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the analysis data as user attribute data;
305, calculating semantic similarity of attribute categories of each attribute data in the user attribute data in a semantic space to obtain a second similarity value;
306, comparing the second similarity value with a preset similarity threshold, and if the second similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the user attribute data as public attribute data;
307, comparing public attribute values in the public attribute data to obtain a comparison result, and analyzing the comparison result to obtain a same person identification rule;
308, taking the same-person recognition rule and the public attribute data as training corpora, and training a preset recognition tool to obtain a same-person recognition model;
309, extracting second patient information data and third patient information data which belong to the same user account in the user data of the user to be identified;
For the user to be identified, the corresponding user data that the user has authorized for use is acquired from the medical business systems, and the pieces of patient information data belonging to the same user account are extracted from the user data. Two pieces of patient information data can be selected at random from these for same-person identification; that is, second patient information data and third patient information data are selected. The second patient information data includes at least a second patient identification number, a second patient certificate number and second patient basic identity information, and the third patient information data includes at least a third patient identification number, a third patient certificate number and third patient basic identity information.
310, comparing whether the second patient information data is consistent with the third patient information data to obtain a comparison result;
The second patient information data is compared with the third patient information data for consistency. Specifically, the second patient identification number and the third patient identification number are extracted from the second patient information data and the third patient information data respectively, and the two identification numbers belonging to the same user account are compared. During this comparison, the component structures of the identification numbers are compared according to the identification number generation rules of the business system, and the contents of each component are then compared, to obtain an identification number comparison result.
Next, the second patient certificate number and the third patient certificate number are extracted from the second patient information data and the third patient information data respectively, and the two certificate numbers of the same certificate type under the same user account are compared for consistency. As with the identification numbers, the component structures of the certificate numbers are compared according to the certificate number generation rules of the certificate type, and the contents of each component are then compared, to obtain a certificate number comparison result. The certificate type can be limited to the identity card, in which case the certificate number is the identity card number.
Finally, the basic identity information of the second patient and of the third patient is extracted from the second patient information data and the third patient information data respectively, and the basic identity information of the two patients belonging to the same user account is compared, where the basic identity information includes at least name, gender and date of birth. The comparison is made item by item: the name of the second patient with the name of the third patient, the gender of the second patient with the gender of the third patient, and the birth date of the second patient with the birth date of the third patient. The three item-level results are combined and analyzed to obtain the basic identity information comparison result, which is determined to be consistent only when all three item-level results are consistent.
311, calling the same-person identification model, and analyzing the comparison result to obtain a result of whether the second patient and the third patient are the same person.
The comparison results obtained in the three comparison steps are combined and analyzed to obtain the identification result of the same-person identification model. As long as the result of one of the three comparison steps is consistent, that is, when any one of the identification number comparison result, the certificate number comparison result and the basic identity information comparison result is consistent, the same-person identification model determines that the second patient and the third patient under the user account are the same person. If none of the identification number comparison result, the certificate number comparison result and the basic identity information comparison result is consistent, the second patient and the third patient under the user account are determined not to be the same person.
In the embodiment of the present invention, the steps 301-308 are the same as the steps 201-208 in the second embodiment of the same-person identification method based on the same-person identification model, and are not described herein again.
In this embodiment of the invention, the constructed same-person identification model performs same-person identification on the patients belonging to the same account according to each patient's information data, which improves the accuracy of same-person identification and facilitates subsequent management of the information of each patient under the same account.
Referring to fig. 4, a fourth embodiment of the method for identifying the same person based on the same person identification model according to the embodiment of the present invention includes:
401, obtaining sample user data in each service system;
402, performing offline analysis on the sample user data to obtain analysis data, and extracting attribute data belonging to a preset attribute category from the analysis data as user attribute data;
403, calculating similarity of attribute categories of each attribute data in the user attribute data to obtain a similarity value, and extracting public attribute data from the user attribute data based on the similarity value;
404, comparing public attribute values in the public attribute data to obtain a comparison result, and analyzing the comparison result to obtain a same person identification rule;
405, taking the same-person recognition rule and the public attribute data as training corpora, and training a preset recognition tool to obtain a same-person recognition model;
406, searching user data corresponding to the user to be identified in different service systems, inputting the user data into the same-person identification model for identification, and judging whether the users corresponding to the user data in the different service systems are the same person based on the identification result;
407, taking the user data subjected to the same-person recognition as secondary training corpora;
After the same-person recognition model has been called to perform same-person recognition on the user to be recognized, the recognized user data is available. This data is used as a secondary training corpus for further training the constructed same-person recognition model, thereby improving model precision. The secondary training corpus may consist of all recognized user data, or only of the user data whose recognition result is "same person".
408, performing secondary training on the same-person recognition model based on the secondary training corpus to obtain the same-person recognition model after secondary training.
The same-person recognition model is then trained a second time on the secondary training corpus. During this secondary training, the secondary training corpus may either be combined with the earlier training corpus or be used on its own. After secondary training, the resulting same-person recognition model has higher recognition precision, which improves recognition accuracy and efficiency.
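The two choices described for steps 407 and 408 (which recognized data enters the secondary corpus, and whether it is merged with the earlier corpus) can be sketched as below. The function names, the `(user_data, is_same_person)` pair representation, and the flags are hypothetical; the training call itself is omitted because the text does not specify the recognition tool.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: (user_data, is_same_person) pairs
# produced by an earlier recognition pass.
Labeled = Tuple[dict, bool]


def build_secondary_corpus(identified: Iterable[Labeled],
                           same_person_only: bool = False) -> List[Labeled]:
    """Select the recognized data that forms the secondary training corpus."""
    items = list(identified)
    if same_person_only:
        # Option 2: keep only data whose recognition result was "same person".
        return [item for item in items if item[1]]
    # Option 1: use all recognized data.
    return items


def secondary_training_set(previous: Iterable[Labeled],
                           secondary: Iterable[Labeled],
                           combine: bool = True) -> List[Labeled]:
    """Either merge the secondary corpus with the earlier one or use it alone."""
    if combine:
        return list(previous) + list(secondary)
    return list(secondary)
```

The list returned by `secondary_training_set` would then be passed to whatever training routine produced the original model.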
In the present embodiment, the step 401-.
In the embodiment of the invention, the data which is subjected to the same-person recognition is used as the training corpus of the secondary training, and the constructed same-person recognition model is subjected to the secondary training, so that the precision of the recognition parameters of the same-person recognition model is improved, and the accuracy of the recognition result is improved.
The same-person identification method based on the same-person identification model in the embodiment of the present invention is described above; the same-person identification device in the embodiment of the present invention is described below. Referring to fig. 5, an embodiment of the same-person identification device includes:
an obtaining module 501, configured to obtain sample user data in each service system;
the analysis module 502 is configured to perform offline analysis on the sample user data to obtain analysis data, and extract attribute data belonging to a preset attribute category from the analysis data as user attribute data;
a calculating module 503, configured to calculate similarity of attribute categories of each attribute data in the user attribute data to obtain a similarity value, and extract public attribute data from the user attribute data based on the similarity value;
a comparison module 504, configured to compare public attribute values in the public attribute data to obtain a comparison result, and analyze the comparison result to obtain a same-person recognition rule;
the training module 505 is configured to train a preset recognition tool by using the same-person recognition rule and the public attribute data as training corpora to obtain a same-person recognition model;
the identification module 506 is configured to search user data corresponding to a user to be identified, input the user data into the same-person identification model for identification, and determine whether users corresponding to the user data in different service systems are the same person based on an identification result.
In this embodiment of the invention, the same-person identification device performs the steps of the same-person identification method based on the same-person identification model to identify the user to be identified, with high identification efficiency and high accuracy.
Referring to fig. 6, another embodiment of the same person identification apparatus in the embodiment of the present invention includes:
an obtaining module 501, configured to obtain sample user data in each service system;
the analysis module 502 is configured to perform offline analysis on the sample user data to obtain analysis data, and extract attribute data belonging to a preset attribute category from the analysis data as user attribute data;
a calculating module 503, configured to calculate similarity of attribute categories of each attribute data in the user attribute data to obtain a similarity value, and extract public attribute data from the user attribute data based on the similarity value;
a comparison module 504, configured to compare public attribute values in the public attribute data to obtain a comparison result, and analyze the comparison result to obtain a same-person recognition rule;
the training module 505 is configured to train a preset recognition tool by using the same-person recognition rule and the public attribute data as training corpora to obtain a same-person recognition model;
the identification module 506 is configured to search user data corresponding to a user to be identified, input the user data into the same-person identification model for identification, and determine whether users corresponding to the user data in different service systems are the same person based on an identification result.
Optionally, the parsing module 502 is specifically configured to:
performing off-line analysis on the sample user data based on a preset data analysis rule to obtain analysis data;
extracting attribute feature information of each attribute data from the analyzed data, and calculating semantic similarity between the attribute feature information and a preset attribute category to obtain a first similarity value;
and comparing the first similarity value with a preset similarity threshold, and if the first similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the analysis data as user attribute data.
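The threshold filter performed by the parsing module can be sketched as follows. The text does not say how semantic similarity is computed, so this example substitutes a character-level ratio from Python's `difflib` as a stand-in; treating the field name as the "attribute feature information", as well as the function names and the default threshold, are also assumptions.

```python
from difflib import SequenceMatcher
from typing import Dict, Iterable


def similarity(a: str, b: str) -> float:
    """Stand-in for semantic similarity: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def extract_user_attributes(parsed: Dict[str, str],
                            preset_categories: Iterable[str],
                            threshold: float = 0.6) -> Dict[str, str]:
    """Keep parsed fields whose best category similarity reaches the threshold."""
    categories = list(preset_categories)
    kept = {}
    for field, value in parsed.items():
        # First similarity value: best match against any preset category.
        first_similarity = max(
            (similarity(field, c) for c in categories), default=0.0)
        if first_similarity >= threshold:  # "not less than" the threshold
            kept[field] = value
    return kept
```

In practice the stand-in would be replaced by an embedding-based or dictionary-based semantic measure; only the "not less than the threshold" comparison is taken from the text.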
Optionally, the calculating module 503 is specifically configured to:
in a semantic space, calculating semantic similarity of attribute categories of each attribute data in the user attribute data to obtain a second similarity value;
and comparing the second similarity value with a preset similarity threshold, and if the second similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the user attribute data as public attribute data.
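One possible reading of the public attribute extraction is sketched below: attribute categories from different service systems are compared pairwise, and an attribute is treated as public when it has a counterpart in another system whose similarity is not less than the threshold. The "at least one other system" criterion, the character-level `difflib` ratio used in place of a real semantic-space similarity, and all names are assumptions for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Set, Tuple


def similarity(a: str, b: str) -> float:
    """Stand-in for similarity in a semantic space (assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def public_attributes(attrs_by_system: Dict[str, List[str]],
                      threshold: float = 0.8) -> Set[Tuple[str, str]]:
    """Return (system, attribute) pairs that have a counterpart elsewhere."""
    public: Set[Tuple[str, str]] = set()
    systems = list(attrs_by_system)
    for i, s1 in enumerate(systems):
        for s2 in systems[i + 1:]:
            for a in attrs_by_system[s1]:
                for b in attrs_by_system[s2]:
                    # Second similarity value, compared with the threshold.
                    if similarity(a, b) >= threshold:
                        public.add((s1, a))
                        public.add((s2, b))
    return public
```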
Optionally, the identifying module 506 includes:
the extracting unit 5061 is configured to obtain user data of a user to be identified, and extract a user account in the user data;
the searching unit 5062 is configured to search user information data and first patient information data corresponding to the user account from the user data, where the user information data at least includes a user identification number, a user certificate number, and user basic identity information, and the first patient information data at least includes a first patient identification number, a first patient certificate number, and first patient basic identity information;
an analyzing unit 5063, configured to invoke the same-person recognition model, and analyze the user information data and the first patient information data belonging to the same user account to obtain a result of whether the user and the first patient are the same person.
Optionally, the analysis unit 5063 is specifically configured to:
calling the same-person identification model, and comparing whether the user identification number belonging to the same user account is consistent with the first patient identification number to obtain a first comparison result;
comparing whether the user certificate number belonging to the same certificate type under the same user account number is consistent with the first patient certificate number or not to obtain a second comparison result;
comparing whether the basic identity information of the user under the same account number is consistent with the basic identity information of the first patient or not to obtain a third comparison result, wherein the basic identity information at least comprises name, gender and date of birth;
determining that the user is the same person as the first patient when at least one of the first comparison result, the second comparison result and the third comparison result is consistent.
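A sketch of how the analysis unit might report its three comparison results for the user and each patient under one account is given below. The dictionary layout of an account (`"user"`, `"patients"`), the key names, and the rule that missing values never match are hypothetical.

```python
from typing import List, NamedTuple


class Comparison(NamedTuple):
    first: bool    # identification numbers consistent
    second: bool   # certificate numbers consistent (same certificate type)
    third: bool    # name, gender and date of birth all consistent

    @property
    def same_person(self) -> bool:
        # At least one consistent result is sufficient.
        return self.first or self.second or self.third


def _eq(a, b) -> bool:
    # Assumption: a missing value never counts as a match.
    return a is not None and a == b


def compare(user: dict, patient: dict) -> Comparison:
    first = _eq(user.get("id_number"), patient.get("id_number"))
    second = (_eq(user.get("cert_type"), patient.get("cert_type"))
              and _eq(user.get("cert_number"), patient.get("cert_number")))
    third = all(_eq(user.get(k), patient.get(k))
                for k in ("name", "gender", "birth_date"))
    return Comparison(first, second, third)


def patients_matching_user(account: dict) -> List[dict]:
    """Patients under the account that are judged to be the account user."""
    user = account["user"]
    return [p for p in account.get("patients", [])
            if compare(user, p).same_person]
```

Keeping the three results separate, rather than returning a single boolean, makes it possible to record which comparison produced the match.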
Optionally, the identification module 506 is specifically configured to:
extracting second patient information data and third patient information data which belong to the same user account in the user data of the user to be identified;
comparing whether the second patient information data is consistent with the third patient information data to obtain a comparison result;
and calling the same-person identification model, and analyzing the comparison result to obtain a result of whether the second patient and the third patient are the same person.
Optionally, the same-person recognition apparatus further includes a secondary training module 507, where the secondary training module 507 is specifically configured to:
using the user data subjected to the same-person recognition as a secondary training corpus;
and carrying out secondary training on the same-person recognition model based on the secondary training corpus to obtain the same-person recognition model after secondary training.
In the embodiment of the invention, the device can identify the same person for the user and the patient belonging to the same user account, and also can identify the same person for the patient and the patient belonging to the same user account, so that the efficiency of identifying the same person is improved, and the accuracy of the model is improved by performing secondary training on the constructed model, thereby improving the accuracy of the identification result.
Referring to fig. 7, an embodiment of the same person identification device in the embodiment of the present invention is described in detail below from the viewpoint of hardware processing.
Fig. 7 is a schematic structural diagram of a same-person identification device 700 according to an embodiment of the present invention. The same-person identification device 700 may vary considerably with configuration or performance, and may include one or more central processing units (CPUs) 710 (e.g., one or more processors), a memory 720, and one or more storage media 730 (e.g., one or more mass storage devices) storing applications 733 or data 732. The memory 720 and the storage medium 730 may provide transient or persistent storage. The program stored on the storage medium 730 may include one or more modules (not shown), each of which may include a series of instruction operations for the same-person identification device 700. Further, the processor 710 may be configured to communicate with the storage medium 730 and to execute, on the same-person identification device 700, the series of instruction operations in the storage medium 730.
The same-person identification device 700 may also include one or more power supplies 740, one or more wired or wireless network interfaces 750, one or more input-output interfaces 760, and/or one or more operating systems 731, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc. Those skilled in the art will appreciate that the structure shown in fig. 7 does not limit the same-person identification device, which may include more or fewer components than shown, combine some components, or arrange the components differently.
The present invention also provides a computer-readable storage medium, which may be a non-volatile or a volatile computer-readable storage medium. The computer-readable storage medium stores instructions which, when run on a computer, cause the computer to perform the steps of the same-person identification method based on the same-person identification model.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. The same-person identification method based on the same-person identification model is characterized by comprising the following steps of:
acquiring sample user data in each service system;
performing off-line analysis on the sample user data to obtain analysis data, and extracting attribute data belonging to a preset attribute category from the analysis data as user attribute data;
calculating the similarity of the attribute categories of each attribute data in the user attribute data to obtain a similarity value, and extracting public attribute data from the user attribute data based on the similarity value;
comparing public attribute values in the public attribute data to obtain a comparison result, and analyzing the comparison result to obtain a same person identification rule;
taking the same-person recognition rule and the public attribute data as training corpora, and training a preset recognition tool to obtain a same-person recognition model;
and searching user data corresponding to the user to be identified in different service systems, inputting the user data into the same-person identification model for identification, and judging whether the users corresponding to the user data in the different service systems are the same person or not based on the identification result.
2. The method of claim 1, wherein the step of analyzing the sample user data offline to obtain analyzed data and extracting attribute data belonging to a preset attribute category from the analyzed data as user attribute data comprises:
performing off-line analysis on the sample user data based on a preset data analysis rule to obtain analysis data;
extracting attribute feature information of each attribute data from the analyzed data, and calculating semantic similarity between the attribute feature information and a preset attribute category to obtain a first similarity value;
and comparing the first similarity value with a preset similarity threshold, and if the first similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the analysis data as user attribute data.
3. The method of claim 2, wherein the calculating the similarity of the attribute categories of each attribute data in the user attribute data to obtain a similarity value, and extracting public attribute data from the user attribute data based on the similarity value comprises:
in a semantic space, calculating semantic similarity of attribute categories of each attribute data in the user attribute data to obtain a second similarity value;
and comparing the second similarity value with a preset similarity threshold, and if the second similarity value is not less than the preset similarity threshold, extracting corresponding attribute data from the user attribute data as public attribute data.
4. The method of claim 3, wherein the searching for the user data corresponding to the user to be identified in the different service systems, inputting the user data into the same-person identification model for identification, and determining whether the users corresponding to the user data in the different service systems are the same person based on the identification result comprises:
searching user data corresponding to a user to be identified in different service systems, and extracting a user account in the user data;
searching user information data and first patient information data corresponding to the user account from user data, wherein the user information data at least comprises a user identification number, a user certificate number and user basic identity information, and the first patient information data at least comprises a first patient identification number, a first patient certificate number and first patient basic identity information;
and calling the same-person identification model, and analyzing the user information data and the first patient information data belonging to the same user account to obtain a result of whether the user and the first patient are the same person.
5. The method of claim 4, wherein the calling of the same-person identification model to analyze the user information data and the first patient information data belonging to the same user account to obtain a result of whether the user and the first patient are the same person comprises:
calling the same-person identification model, and comparing whether the user identification number belonging to the same user account is consistent with the first patient identification number to obtain a first comparison result;
comparing whether the user certificate number belonging to the same certificate type under the same user account number is consistent with the first patient certificate number or not to obtain a second comparison result;
comparing whether the basic identity information of the user under the same account number is consistent with the basic identity information of the first patient or not to obtain a third comparison result, wherein the basic identity information at least comprises name, gender and date of birth;
determining that the user is the same person as the first patient when at least one of the first comparison result, the second comparison result and the third comparison result is consistent.
6. The method of claim 5, wherein the searching for the user data corresponding to the user to be identified in the different service systems, inputting the user data into the same-person identification model for identification, and determining whether the users corresponding to the user data in the different service systems are the same person based on the identification result comprises:
extracting second patient information data and third patient information data which belong to the same user account in the user data of the user to be identified;
comparing whether the second patient information data is consistent with the third patient information data to obtain a comparison result;
and calling the same-person identification model, and analyzing the comparison result to obtain a result of whether the second patient and the third patient are the same person.
7. The method for identifying the same person based on the same person identification model according to any one of claims 1-6, wherein after searching for the user data corresponding to the user to be identified in the different service systems, inputting the user data into the same person identification model for identification, and determining whether the users corresponding to the user data in the different service systems are the same person based on the identification result, the method further comprises:
using the user data subjected to the same-person recognition as a secondary training corpus;
and carrying out secondary training on the same-person recognition model based on the secondary training corpus to obtain the same-person recognition model after secondary training.
8. A same-person identification apparatus, comprising:
the acquisition module is used for acquiring sample user data in each service system;
the analysis module is used for carrying out off-line analysis on the sample user data to obtain analysis data, and extracting attribute data belonging to a preset attribute category from the analysis data to serve as user attribute data;
the calculation module is used for calculating the similarity of the attribute categories of the attribute data in the user attribute data to obtain a similarity value, and extracting public attribute data from the user attribute data based on the similarity value;
the comparison module is used for comparing public attribute values in the public attribute data to obtain a comparison result, and analyzing the comparison result to obtain a same-person recognition rule;
the training module is used for training a preset recognition tool by taking the same-person recognition rule and the public attribute data as training corpora to obtain a same-person recognition model;
and the identification module is used for searching user data corresponding to the user to be identified in different service systems, inputting the user data into the same-person identification model for identification, and judging whether the users corresponding to the user data in the different service systems are the same person or not based on the identification result.
9. A same-person identification device, characterized in that the same-person identification device comprises:
a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line;
the at least one processor invoking the instructions in the memory to cause the same-person identification device to perform the steps of the same-person identification method based on the same-person identification model according to any one of claims 1-7.
10. A computer-readable storage medium having instructions stored thereon, which when executed by a processor implement the steps of the same-person identification method based on the same-person identification model according to any one of claims 1-7.
CN202110433355.XA 2021-04-22 2021-04-22 Same-person identification method based on same-person identification model and related equipment Pending CN113139005A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110433355.XA CN113139005A (en) 2021-04-22 2021-04-22 Same-person identification method based on same-person identification model and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110433355.XA CN113139005A (en) 2021-04-22 2021-04-22 Same-person identification method based on same-person identification model and related equipment

Publications (1)

Publication Number Publication Date
CN113139005A true CN113139005A (en) 2021-07-20

Family

ID=76813422

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110433355.XA Pending CN113139005A (en) 2021-04-22 2021-04-22 Same-person identification method based on same-person identification model and related equipment

Country Status (1)

Country Link
CN (1) CN113139005A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination