CN110955753A - Data mapping method, device, equipment and storage medium - Google Patents

Info

Publication number
CN110955753A
Authority
CN
China
Prior art keywords
standard
standard code
data
codes
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911187112.1A
Other languages
Chinese (zh)
Other versions
CN110955753B (en
Inventor
黄润桓
高英明
欧龙平
叶韶蘅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taikang Insurance Group Co Ltd
Original Assignee
Taikang Insurance Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taikang Insurance Group Co Ltd
Priority to CN201911187112.1A
Publication of CN110955753A
Application granted
Publication of CN110955753B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the application provide a data mapping method, apparatus, device and storage medium. In the method, a data standardization device matches service data received from a service server against one base standard code of each standard code set in at least one standard code set. If the service data successfully matches the base standard code of a target standard code set among the at least one standard code set, the data standardization device selects one target standard code from the target standard code set and establishes a mapping relation between the service data and that target standard code. Compared with the existing full matching approach, the data standardization device therefore does not need to match the service data against every standard code in every standard code set, which improves matching efficiency.

Description

Data mapping method, device, equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a data mapping method, a data mapping device, data mapping equipment and a storage medium.
Background
The medical data standardization platform is a basic project of an insurance unit and is a precondition for accessing and identifying medical data sent by health risk related equipment.
In general, the format of the medical data transmitted by the health risk related device is greatly different from the format of the standard data in the existing medical data standardization device. The existing medical data standardization equipment needs to compare the similarity between the received medical data and all standard data through a similarity algorithm, and when the similarity between the medical data and any standard data is larger than a preset similarity threshold, a mapping relation between the medical data and the standard data is established so as to facilitate subsequent underwriting operation and the like.
However, when receiving each piece of medical data, the existing medical data standardization equipment needs to perform similarity matching with all standard data, and thus, the efficiency of the existing matching mode is low.
Disclosure of Invention
The embodiment of the application provides a data mapping method, a data mapping device, data mapping equipment and a storage medium, and solves the technical problem that the existing matching mode is low in efficiency.
In a first aspect, an embodiment of the present application provides a data mapping method, including:
a data standardization device receives service data sent by a service server, wherein the data standardization device includes M standard codes, the M standard codes belong to N standard code sets, 1 ≤ N ≤ M, and N and M are integers;
the data standardization equipment matches the service data with one basic standard code of each standard code set in at least one standard code set;
and if the service data is successfully matched with the basic standard codes of the target standard code set in at least one standard code set, the data standardization equipment selects one target standard code in the target standard code set and establishes the mapping relation between the service data and the target standard code.
In a possible implementation manner, before the data normalization apparatus matches the service data with one base standard code of each standard code set of at least one standard code set, the method further includes:
the data normalization apparatus establishes the N standard encoding sets.
In one possible implementation, the data normalization apparatus establishes the N standard encoding sets, including:
the data standardization device calculates the similarity between a first standard code and each of the other standard codes in the M standard codes, and groups the first standard code together with the other standard codes whose similarity to the first standard code is greater than a preset threshold into one standard code set, wherein the first standard code is any standard code in the M standard codes;
and the data standardization device removes the standard codes that have already been assigned to a standard code set from the M standard codes, selects one of the remaining standard codes as a new first standard code, and continues to calculate the similarity between the new first standard code and the other remaining standard codes, until every standard code has been assigned to a corresponding standard code set.
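The grouping procedure described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function names `similarity` and `build_code_sets`, the use of Python's `difflib.SequenceMatcher` as the similarity measure, and the convention of storing the base standard code as the first element of each set are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; any other similarity measure could be used."""
    return SequenceMatcher(None, a, b).ratio()


def build_code_sets(codes, threshold, sim=similarity):
    """Greedily partition codes into sets; element 0 of each set is its base code."""
    remaining = list(codes)
    code_sets = []
    while remaining:
        first = remaining.pop(0)        # the "first standard code" of this round
        group = [first]
        leftover = []
        for other in remaining:
            if sim(first, other) > threshold:
                group.append(other)     # similar enough: joins the current set
            else:
                leftover.append(other)  # stays available for later rounds
        code_sets.append(group)
        remaining = leftover            # codes already grouped are removed
    return code_sets
```

Each round removes at least the first standard code from the remaining codes, so the loop always terminates, and a code that is similar to no other code ends up in a set of its own.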
In a possible implementation manner, the data normalization apparatus matches the service data with one base standard code of each standard code set of at least one standard code set, and includes:
the data standardization equipment matches the service data with the basic standard codes of the first standard code set of the N standard code sets according to a preset sequence;
if the service data is successfully matched with the basic standard codes of the first standard code set, stopping;
otherwise, the data standardization equipment matches the service data with the base standard code of a second standard code set in the N standard code sets according to the preset sequence, and so on, until a base standard code is matched or until all base standard codes have been tried without a match.
In one possible implementation, the data normalization apparatus selects one target standard code from the target standard code set, and includes:
and the data standardization equipment selects the standard code with the highest similarity with the service data in the target standard code set as the target standard code.
In one possible implementation, the data normalization apparatus selects one target standard code from the target standard code set, and includes:
the data normalization apparatus uses the base standard code of the target standard code set as the target standard code.
In one possible implementation, the data normalization apparatus selects one target standard code from the target standard code set, and includes:
the data normalization apparatus randomly selects one standard code from the target standard code set as the target standard code.
In a second aspect, an embodiment of the present application provides a data mapping apparatus, including:
a receiving module, configured to receive service data sent by a service server, where the data standardization device includes M standard codes, the M standard codes belong to N standard code sets, 1 ≤ N ≤ M, and N and M are integers;
the matching module is used for matching the service data with one basic standard code of each standard code set in at least one standard code set;
the first establishing module is used for selecting a target standard code from the target standard code set and establishing a mapping relation between the service data and the target standard code if the matching module successfully matches the service data with the basic standard code of the target standard code set in at least one standard code set.
In a possible implementation manner, the data mapping apparatus further includes:
and the second establishing module is used for establishing the N standard coding sets.
In a possible implementation manner, the second establishing module is specifically configured to:
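calculating the similarity between a first standard code and each of the other standard codes in the M standard codes, and grouping the first standard code together with the other standard codes whose similarity to the first standard code is greater than a preset threshold into one standard code set, wherein the first standard code is any standard code in the M standard codes;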
and removing the standard codes divided into the standard code set from the M standard codes, selecting one standard code from the rest standard codes, using the standard code as a new first standard code, and continuously calculating the similarity between the new first standard code and other standard codes in the rest standard codes until each standard code is divided into the corresponding standard code set.
In a possible implementation manner, the matching module is specifically configured to:
matching the service data with the basic standard codes of the first standard code set of the N standard code sets according to a preset sequence;
if the service data is successfully matched with the basic standard codes of the first standard code set, stopping;
otherwise, matching the service data with the base standard code of a second standard code set in the N standard code sets according to the preset sequence, and so on, until a base standard code is matched or until all base standard codes have been tried without a match.
In a possible implementation manner, the first establishing module is specifically configured to:
and selecting the standard code with the highest similarity with the service data in the target standard code set as the target standard code.
In a possible implementation manner, the first establishing module is specifically configured to:
and using the base standard code of the target standard code set as the target standard code.
In a possible implementation manner, the first establishing module is specifically configured to:
randomly selecting one standard code in the target standard code set as the target standard code.
In a third aspect, an embodiment of the present application provides a data normalization apparatus, including:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any implementation of the first aspect described above via execution of the executable instructions.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method described in any implementation manner of the first aspect.
The embodiment of the application provides a data mapping method, a data mapping device, data mapping equipment and a storage medium, wherein the data standardization equipment matches service data received from a service server with one basic standard code of each standard code set in at least one standard code set. If the service data is successfully matched with the basic standard codes of the target standard code set in at least one standard code set, the data standardization equipment selects one target standard code from the target standard code set and establishes the mapping relation between the service data and the target standard code. It can be seen that, compared with the existing full matching method, in the embodiment of the present application, the data normalization device only needs to perform similarity matching on the service data and the base standard codes corresponding to at least one standard code set, and then selects one target standard code from the target standard code set to establish a mapping relationship when the service data is successfully matched with the base standard codes of the target standard code set in the at least one standard code set, without matching the service data with each standard code in each standard code set, so as to improve matching efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to these drawings without inventive exercise.
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application;
Fig. 2 is a schematic flowchart of a data mapping method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a matching process provided in an embodiment of the present application;
Fig. 4 is a schematic flowchart of a data mapping method according to another embodiment of the present application;
Fig. 5 is a schematic diagram illustrating a process of establishing a standard encoding set according to an embodiment of the present application;
Fig. 6 is a schematic flowchart of a data mapping method according to another embodiment of the present application;
Fig. 7 is a schematic structural diagram of a data mapping apparatus according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a data normalization device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application. As shown in Fig. 1, the application scenario of the embodiment of the present application may include, but is not limited to, a service server and the data standardization device provided by the embodiment of the present application. Of course, the application scenario may further include other devices, for example, a terminal corresponding to the data standardization device, which is not limited in the embodiment of the present application.
In the embodiment of the present application, the data standardization device relies on the transitivity of similarity (for example, if A is similar to B with a similarity greater than 50%, and B is similar to C with a similarity greater than 50%, then A can also be regarded as similar to C, although the similarity decreases as the number of transfers increases) to pre-divide all standard codes into at least one standard code set. When the device receives service data sent by the service server (the service data may include, but is not limited to, medical data), it can then match the received service data against one base standard code of each standard code set in the at least one standard code set. If the service data successfully matches the base standard code of a target standard code set among the at least one standard code set, the data standardization device selects one target standard code from the target standard code set and establishes a mapping relation between the service data and the target standard code. Compared with the existing full matching approach, the data standardization device therefore does not need to match the service data against every standard code in every standard code set, which improves matching efficiency.
The data mapping method provided in the embodiments of the present application may be executed by the data standardization device, or by a data mapping apparatus within the data standardization device (the embodiments provided in the present application take the data standardization device as the example). The data standardization device, or the data mapping apparatus therein, may be implemented in software and/or hardware.
For example, the data standardization device provided by the embodiment of the present application may be a data standardization server.
Illustratively, the service server in the embodiment of the present application may correspond to a health risk related device in the prior art, and may include, but is not limited to: a data management device of a hospital, a history policy data management device of an insurance company, and a medical data management device of a health management company.
Illustratively, any standard code set referred to in the embodiments of the present application includes at least one standard code, one of which is the base standard code of the set. The base standard code may be a standard code whose similarity to each of the other standard codes in the set is greater than preset threshold 1. It should be understood that if a standard code set includes only one standard code, that standard code is the base standard code of the set.
For example, a base standard encoding in any standard encoding set may refer to a first standard encoding in the standard encoding set.
The technical solution of the present application will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 2 is a schematic flowchart of a data mapping method according to an embodiment of the present application. As shown in fig. 2, the method of the embodiment of the present application may include:
step S201, the data standardization device receives service data sent by the service server.
In this step, the data normalization device receives service data sent by the service server, where the service data may include, but is not limited to, medical data.
It should be understood that the service server may actively send the service data to the data standardization device, or the service server may send the service data to the data standardization device after receiving the acquisition instruction sent by the data standardization device. Of course, the service server may also send the service data to the data standardization device in other situations, which is not limited in this embodiment of the application.
Step S202, the data normalization device matches the service data with a base standard code of each standard code set in at least one standard code set.
In this embodiment of the present application, the data normalization device may include M standard codes, where the M standard codes belong to N standard code sets, where 1 ≦ N < M, and N, M are integers.
Illustratively, any standard encoding set referred to in the embodiments of the present application includes at least one standard encoding, and the standard encoding set includes one base standard encoding, and the base standard encoding may refer to a standard encoding whose similarity with other standard encodings in the standard encoding set is greater than a preset threshold 1. For example, assume that an arbitrary set of standard codes includes: the standard encoding method includes standard encoding 1, standard encoding 2 and standard encoding 3, where a similarity between the standard encoding 1 and the standard encoding 2 is greater than a preset threshold 1, and a similarity between the standard encoding 1 and the standard encoding 3 is also greater than the preset threshold 1, and then the standard encoding 1 may be a basic standard encoding of the standard encoding set.
It should be understood that if a standard code set includes only one standard code, that standard code is the base standard code of the set.
For example, a base standard encoding in any standard encoding set may refer to a first standard encoding in the standard encoding set.
In this step, the data normalization device may perform similarity matching on the service data and a basic standard code corresponding to at least one standard code set of the N standard code sets according to a preset sequence. For example, if the similarity between the service data and the base standard code corresponding to a standard code set of the N standard code sets is greater than a preset threshold 2, the data normalization device may consider that the service data is successfully matched with the base standard code of the standard code set (or referred to as a target standard code set).
For another example, if the similarity between the service data and the base standard code corresponding to one of the N standard code sets is not greater than the preset threshold 2, the data normalization device may consider that the service data fails to match the base standard code of the standard code set.
It should be understood that, if the similarity between the service data and the base standard code corresponding to a standard code set of the N standard code sets is equal to a preset threshold 2, the data normalization device may also consider that the service data is successfully matched with the base standard code of the standard code set (or referred to as a target standard code set). Correspondingly, if the similarity between the service data and the basic standard code corresponding to one of the N standard code sets is smaller than a preset threshold 2, the data normalization device may consider that the service data fails to match the basic standard code of the standard code set.
It should be noted that the preset threshold 2 and the preset threshold 1 may be the same or different.
Compared with the existing full matching mode, the data standardization device in the step only needs to match the similarity between the service data and the basic standard codes corresponding to at least one standard code set, and does not need to match the service data with each standard code in each standard code set, so that the matching efficiency can be improved.
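As a rough illustration of where the saving comes from, the number of similarity comparisons needed for one piece of service data can be written out as below. The symbols C, k and S_t are introduced here only for this sketch and do not appear in the application: k is the number of base standard codes compared before matching stops, and S_t is the target standard code set.

```latex
\begin{aligned}
C_{\text{full}} &= M \\
C_{\text{proposed}} &= k + \bigl(|S_t| - 1\bigr), \qquad 1 \le k \le N
\end{aligned}
```

The second term applies only if the target standard code is chosen as the code with the highest similarity in the target set, which requires comparing the remaining members of that set; it is zero when the base standard code or a randomly selected code is used. If no base standard code matches, k = N and no further comparison is made. The proposed count is smaller than M whenever the standard codes outside the target set outnumber the base standard codes compared.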
For example, in this embodiment of the application, when matching the service data against the base standard code of any standard code set, the data normalization apparatus may use at least one of the following algorithms: an edit distance algorithm, a cosine similarity algorithm, or a Euclidean distance algorithm. Of course, other similarity matching algorithms may also be used, which is not limited in the embodiments of the present application.
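The three algorithms named above can be sketched for plain text strings as follows. The application does not say how service data and standard codes are represented, so the character-count vectors used for the cosine and Euclidean measures, the normalization of edit distance into a similarity score, and all function names are assumptions made for this example.

```python
import math
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Map edit distance to a similarity in [0, 1] (1.0 means identical)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the character-count vectors of a and b."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[ch] * vb[ch] for ch in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def euclidean_distance(a: str, b: str) -> float:
    """Euclidean distance between the character-count vectors of a and b."""
    va, vb = Counter(a), Counter(b)
    return math.sqrt(sum((va[ch] - vb[ch]) ** 2 for ch in set(va) | set(vb)))
```

Note that edit distance and Euclidean distance are distances (smaller means more alike), so they would have to be converted to a similarity, as `edit_similarity` does, before being compared against a "greater than the preset threshold" rule.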
Optionally, the data normalization device may match the service data with the base standard codes of the first standard code set of the N standard code sets according to a preset sequence.
In this embodiment of the application, the data normalization device may pre-sort the received service data into a service data queue and pre-sort the N standard code sets into a standard code set queue, so that the pieces of service data are matched one after another in the preset order. For each piece of service data, the data normalization device matches it against the base standard codes of the N standard code sets in that preset order.
Fig. 3 is a schematic diagram of a matching process provided in the embodiment of the present application. As shown in Fig. 3, assume that the N standard code sets include, in the preset order, standard code set 1, standard code set 2, standard code set 3 and standard code set 4, where standard code set 1 includes standard code A (base standard code), standard code D and standard code E; standard code set 2 includes standard code B (base standard code), standard code C and standard code F; standard code set 3 includes standard code G (base standard code) and standard code I; and standard code set 4 includes standard code H (base standard code).
As shown in Fig. 3, the data normalization apparatus may perform similarity matching between the service data (e.g., service data 1) and the base standard code (e.g., standard code A) of the first standard code set (e.g., standard code set 1) of the N standard code sets according to the preset order. If the service data is successfully matched with the base standard code of the first standard code set, the data normalization apparatus stops similarity matching for that service data and determines that the first standard code set is the target standard code set.
Alternatively, if the service data (e.g., service data 1) fails to match the base standard code (e.g., standard code A) of the first standard code set, the data normalization apparatus may continue, in the preset order, by matching the service data against the base standard code (e.g., standard code B) of the second standard code set (e.g., standard code set 2) of the N standard code sets. If this match succeeds, the data normalization apparatus stops similarity matching for that service data and determines that the second standard code set is the target standard code set.
If the match against the base standard code (e.g., standard code B) of the second standard code set also fails, the data normalization apparatus may continue, in the preset order, by matching the service data against the base standard code (e.g., standard code G) of the third standard code set (e.g., standard code set 3) of the N standard code sets, and so on, until a base standard code is matched or until all base standard codes have been tried without a match. Similarity matching can then proceed to the next piece of service data.
As shown in Fig. 3, if the service data (e.g., service data 1) is successfully matched with the base standard code (e.g., standard code G) of the third standard code set (e.g., standard code set 3), the data normalization apparatus stops similarity matching for that service data and determines that the third standard code set is the target standard code set.
It should be noted that, after performing similarity matching on the service data (for example, the service data 1), the data standardization device may continue to perform similarity matching on the service data 2 and the service data 3 in sequence, and a specific matching process may be the same as the process of performing similarity matching on the service data 1, and is not described herein again.
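The sequential matching walked through above can be sketched with the standard code sets of Fig. 3. The function name `find_target_set`, the toy similarity scores, the threshold value of 0.5 and the convention that each list starts with its base standard code are all assumptions made for this example.

```python
def find_target_set(service_data, code_sets, sim, threshold):
    """Return (index, code_set) of the first set whose base code matches, else None."""
    for index, code_set in enumerate(code_sets):
        base_code = code_set[0]          # assumption: base code is stored first
        if sim(service_data, base_code) > threshold:
            return index, code_set       # stop at the first successful match
    return None                          # no base standard code matched


# Fig. 3 layout; each list starts with its base standard code.
FIG3_SETS = [["A", "D", "E"], ["B", "C", "F"], ["G", "I"], ["H"]]

# Hypothetical similarity scores between "service data 1" and each base code.
TOY_SCORES = {"A": 0.2, "B": 0.3, "G": 0.9, "H": 0.1}
compared = []  # records which base codes were actually compared


def toy_sim(service_data, code):
    compared.append(code)
    return TOY_SCORES[code]


result = find_target_set("service data 1", FIG3_SETS, toy_sim, threshold=0.5)
print(result)    # (2, ['G', 'I'])
print(compared)  # ['A', 'B', 'G']
```

Only three of the nine standard codes are compared in this run, and standard code set 4 is never visited because matching stops at standard code set 3.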
Step S203, if the service data is successfully matched with the basic standard code of the target standard code set in at least one standard code set, the data normalization apparatus selects a target standard code in the target standard code set, and establishes a mapping relationship between the service data and the target standard code.
In this step, if the service data (e.g., the service data 1) is successfully matched with the basic standard code (e.g., the standard code G) of the target standard code set (e.g., the standard code set 3), the data normalization device may select a target standard code from the target standard code set, and establish a mapping relationship between the service data and the target standard code, so as to facilitate subsequent operations such as underwriting.
In a possible implementation manner, the data normalization device may select, as the target standard code, a standard code with the highest similarity to the service data from the target standard code set according to the similarity between the service data (e.g., service data 1) and each standard code in the target standard code set (e.g., standard code set 3). For example, as shown in fig. 3, assuming that the similarity between the service data 1 and the standard code G in the standard code set 3 is greater than the similarity between the service data 1 and the standard code I in the standard code set 3, the data normalization apparatus may select the standard code G as the target standard code in the standard code set 3, so as to establish the mapping relationship between the service data 1 and the standard code G.
In another possible implementation manner, the data normalization apparatus may use the base standard code (e.g., standard code G) of the target standard code set (e.g., standard code set 3) as the target standard code.
In another possible implementation manner, the data normalization apparatus may randomly select one standard code from the target standard code set (e.g., standard code set 3) as the target standard code.
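The three ways of selecting the target standard code can be put side by side in a short sketch. The function name `select_target`, the strategy labels, the toy scores and the use of a dictionary for the mapping relation are assumptions made for this example, as is the convention that the base standard code is the first element of the set.

```python
import random


def select_target(service_data, target_set, sim, strategy="most_similar", rng=random):
    """Pick the target standard code from the target set by one of three strategies."""
    if strategy == "most_similar":
        # compare the service data with every code in the target set
        return max(target_set, key=lambda code: sim(service_data, code))
    if strategy == "base":
        return target_set[0]             # assumption: base code is stored first
    if strategy == "random":
        return rng.choice(target_set)
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical scores in which G is more similar to service data 1 than I.
scores = {"G": 0.9, "I": 0.7}
sim = lambda data, code: scores[code]

target = select_target("service data 1", ["G", "I"], sim)
mapping = {"service data 1": target}     # the established mapping relation
print(mapping)  # {'service data 1': 'G'}
```

The highest-similarity strategy costs extra comparisons within the target set, whereas the base and random strategies need none, at the price of possibly mapping to a code that is not the closest one.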
In this embodiment of the present application, the data normalization device matches the service data received from the service server against one base standard code of each standard code set in at least one standard code set. If the service data is successfully matched with the base standard code of a target standard code set among the at least one standard code set, the data normalization device selects one target standard code from the target standard code set and establishes a mapping relationship between the service data and the target standard code. Compared with the existing full matching method, the data normalization device only needs to perform similarity matching between the service data and the base standard codes of the standard code sets, and selects a target standard code from the target standard code set once a base standard code is matched. The service data does not have to be matched against every standard code in every standard code set, which improves matching efficiency.
Table 1 shows the matching efficiency of the existing full matching method, and Table 2 shows the matching efficiency of the matching method provided in the embodiment of the present application. As Tables 1 and 2 show, for 27944 pieces of service data and 27123 standard codes, the average similarity matching time of the existing full matching method is 588.7262 s, whereas the average similarity matching time of the matching method provided in the embodiment of the present application is only 225.502 s. The matching method provided in the embodiment of the present application therefore improves matching efficiency significantly over the existing full matching method.
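To put the two reported averages side by side, the short calculation below derives the speedup and the relative time saved. Only the two timing values come from the text; the variable names are introduced here for illustration.

```python
# Average similarity-matching times reported in the text (seconds).
full_matching_s = 588.7262   # existing full matching method
base_matching_s = 225.502    # matching against base standard codes

speedup = full_matching_s / base_matching_s
time_saved = 1 - base_matching_s / full_matching_s

print(f"speedup: {speedup:.2f}x, time saved: {time_saved:.1%}")
# prints "speedup: 2.61x, time saved: 61.7%"
```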
Table 1: Matching efficiency of the existing full matching method
[Table image: Figure BDA0002292649440000111, Figure BDA0002292649440000121]
Table 2: Matching efficiency of the matching method provided in the embodiment of the present application
[Table image: Figure BDA0002292649440000122]
Optionally, in order to facilitate the data normalization device to perform matching after receiving the service data, the data normalization device may pre-establish the N standard code sets before performing the step S202, so as to pre-divide the M standard codes in the data normalization device into the N standard code sets.
Fig. 4 is a schematic flowchart of a data mapping method according to another embodiment of the present application. On the basis of the foregoing embodiments, an implementation manner of the data normalization apparatus establishing the N standard encoding sets is described in this embodiment. As shown in fig. 4, the method of the embodiment of the present application may include:
Step S401, the data normalization device calculates the similarity between a first standard code and the other standard codes in the M standard codes, and establishes the first standard code, together with the other standard codes whose similarity to the first standard code is greater than preset threshold 1, as one standard code set.
In this step, the data normalization device may select any standard code from the M standard codes as the first standard code, calculate the similarity between the first standard code and each of the other standard codes in the M standard codes, and establish the first standard code, together with the other standard codes whose similarity to the first standard code is greater than preset threshold 1, as one standard code set; the first standard code may be the base standard code of the standard code set.
For example, the data normalization device may also sort the M standard codes in advance, which helps establish the standard code sets in a more orderly manner.
For example, in this step, when calculating the similarity between the first standard code and the other standard codes in the M standard codes, the data normalization device may adopt at least one of the following algorithms: an edit distance algorithm, a cosine similarity algorithm, and a Euclidean distance algorithm. Other similarity calculation algorithms may also be used, which is not limited in the embodiments of the present application.
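As an illustration of the first listed option, the sketch below computes the Levenshtein edit distance and converts it into a similarity score. Dividing by the longer string length is an assumption made here so that the score falls in [0, 1]; the application does not state how a distance is turned into a similarity, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row DP table."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def edit_similarity(a: str, b: str) -> float:
    """Map edit distance to a similarity in [0, 1] (normalisation assumed here)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

A pair of codes would then belong to the same set when `edit_similarity` exceeds preset threshold 1.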
Fig. 5 is a schematic diagram of a process for establishing standard code sets according to an embodiment of the present application. As shown in Fig. 5, assume that the M standard codes in the data normalization device include standard code A, standard code B, standard code C, standard code D, standard code E, standard code F, standard code G, standard code H, and standard code I. The data normalization device may select standard code A from the M standard codes as the first standard code, calculate the similarity between standard code A and each of the other standard codes in the M standard codes, and establish standard code A, together with the other standard codes whose similarity to standard code A is greater than preset threshold 1 (e.g., standard code D and standard code E), as standard code set 1, where standard code A may be the base standard code of standard code set 1.
Step S402, the data normalization device removes the standard codes that have been divided into the standard code set from the M standard codes, selects one standard code from the remaining standard codes, uses the standard code as a new first standard code, and continues to calculate the similarity between the new first standard code and the other standard codes in the remaining standard codes until each standard code is divided into the corresponding standard code set.
In this step, the data normalization device removes the standard codes that have already been divided into a standard code set from the M standard codes, and selects one of the remaining standard codes as a new first standard code. The data normalization device then calculates the similarity between the new first standard code and each of the other remaining standard codes, and establishes the new first standard code, together with the other standard codes whose similarity to it is greater than preset threshold 1, as a new standard code set; the new first standard code may be the base standard code of the new standard code set. The data normalization device then removes the standard codes that have been divided into the new standard code set from the remaining standard codes, and so on, until every standard code in the data normalization device has been divided into a corresponding standard code set.
As shown in Fig. 5, the data normalization device removes standard code A, standard code D, and standard code E, which have been divided into standard code set 1, from the M standard codes, and may select standard code B as a new first standard code from the remaining standard codes (standard code B, standard code C, standard code F, standard code G, standard code H, and standard code I). The data normalization device then calculates the similarity between standard code B and each of standard codes C, F, G, H, and I, and establishes standard code B, together with the other standard codes whose similarity to standard code B is greater than preset threshold 1 (e.g., standard code C and standard code F), as standard code set 2, where standard code B may be the base standard code of standard code set 2. The data normalization device then removes standard code B, standard code C, and standard code F, which have been divided into standard code set 2, from the remaining standard codes, and so on, until every standard code in the data normalization device has been divided into a corresponding standard code set. For example, the data normalization device establishes standard code G and standard code I as standard code set 3, and standard code H as standard code set 4.
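The partitioning procedure of steps S401 and S402 can be sketched as a greedy loop. The sketch below is a minimal illustration under assumptions: the function name `build_code_sets`, the choice of always taking the first remaining code as the new first standard code, and the toy similarity scores (picked only so that the result reproduces the grouping of Fig. 5) are hypothetical.

```python
from typing import Callable, Dict, List


def build_code_sets(codes: List[str],
                    similarity: Callable[[str, str], float],
                    threshold_1: float) -> Dict[str, List[str]]:
    """Greedy partition: returns {base code: [base code, similar codes...]}."""
    remaining = list(codes)
    code_sets: Dict[str, List[str]] = {}
    while remaining:
        first = remaining[0]                      # new "first standard code"
        members = [first] + [c for c in remaining[1:]
                             if similarity(first, c) > threshold_1]
        code_sets[first] = members                # first code acts as the base code
        # Remove every code that has just been assigned to a set.
        remaining = [c for c in remaining if c not in members]
    return code_sets


# Hypothetical similarity scores chosen to reproduce the grouping of Fig. 5.
GROUP = {"A": 1, "D": 1, "E": 1, "B": 2, "C": 2, "F": 2, "G": 3, "I": 3, "H": 4}


def toy_similarity(x: str, y: str) -> float:
    return 0.9 if GROUP[x] == GROUP[y] else 0.1


sets = build_code_sets(list("ABCDEFGHI"), toy_similarity, threshold_1=0.8)
# sets == {"A": ["A", "D", "E"], "B": ["B", "C", "F"], "G": ["G", "I"], "H": ["H"]}
```

Each pass compares the new first standard code only with codes that are still unassigned, so the number of comparisons shrinks as sets are formed.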
In this embodiment, the data normalization device calculates the similarity between the first standard code and the other standard codes in the M standard codes, and establishes the first standard code, together with the other standard codes whose similarity to the first standard code is greater than preset threshold 1, as one standard code set. The data normalization device then removes the standard codes that have been divided into the standard code set from the M standard codes, selects one of the remaining standard codes as a new first standard code, and continues to calculate the similarity between the new first standard code and the other remaining standard codes until every standard code has been divided into a corresponding standard code set. In this way, the data normalization device divides the M standard codes into N standard code sets by placing standard codes whose similarity is greater than preset threshold 1 in the same standard code set. When service data is received, it only needs to be matched by similarity against the base standard codes of the standard code sets, which improves matching efficiency.
Fig. 6 is a schematic flowchart of a data mapping method according to another embodiment of the present application. On the basis of the above embodiments, this embodiment introduces a complete flow of the data mapping method. As shown in Fig. 6, the complete flow may include, but is not limited to: an initialization phase, a user operation phase, a standard code set establishment phase, a phase of matching service data with standard codes, and a data mapping phase.
1) Initialization phase
The initialization phase may include, but is not limited to: loading historical standard code sets, loading downtime recovery data, initializing a standard code queue, initializing a service data queue, initializing a cache and a distributed lock, and initializing a preset threshold (for example, preset threshold 1 and/or preset threshold 2).
2) User operation phase
The user operation phase may include, but is not limited to: adding standard codes, deleting standard codes, importing standard codes, and triggering the matching of service data with standard codes. It should be understood that the user can perform the operations of the user operation phase through a terminal corresponding to the data normalization device.
3) Standard code set establishment phase
The standard code set establishment phase may include, but is not limited to: adding standard codes to the standard code queue, calculating similarity, establishing standard code sets, and storing the standard code sets in a database and a cache.
4) Phase of matching service data with standard codes
The phase of matching service data with standard codes may include, but is not limited to: adding service data to the service data queue, loading standard code sets (for example, the historical standard code sets and the standard code sets established in the standard code set establishment phase), and matching the service data with the standard codes.
5) Data mapping phase
The data mapping phase may include, but is not limited to: establishing a mapping relationship and persisting the mapping relationship.
For specific implementation processes in the embodiments of the present application, reference may be made to relevant contents in the above embodiments of the present application, and details are not described here.
Fig. 7 is a schematic structural diagram of a data mapping apparatus according to an embodiment of the present application. Optionally, the data mapping apparatus provided in this embodiment may be an apparatus in the data standardization device. As shown in fig. 7, the data mapping apparatus 70 provided in the embodiment of the present application may include: a receiving module 701, a matching module 702 and a first establishing module 703.
a receiving module 701, configured to receive service data sent by a service server, where the data normalization device includes M standard codes, the M standard codes belong to N standard code sets, 1 ≤ N < M, and N and M are integers;
a matching module 702, configured to match the service data with one base standard code of each standard code set in at least one standard code set;
a first establishing module 703, configured to select a target standard code from the target standard code set if the matching module successfully matches the service data with a base standard code of a target standard code set in at least one standard code set, and establish a mapping relationship between the service data and the target standard code.
In a possible implementation manner, the data mapping apparatus 70 further includes:
a second establishing module, configured to establish the N standard code sets.
In a possible implementation manner, the second establishing module is specifically configured to:
calculate the similarity between a first standard code and the other standard codes in the M standard codes, and establish the first standard code and the other standard codes whose similarity to the first standard code is greater than a preset threshold as one standard code set, where the first standard code is any standard code in the M standard codes;
remove the standard codes that have been divided into the standard code set from the M standard codes, select one of the remaining standard codes as a new first standard code, and continue to calculate the similarity between the new first standard code and the other remaining standard codes until every standard code has been divided into a corresponding standard code set.
In a possible implementation manner, the matching module 702 is specifically configured to:
matching the service data with the basic standard codes of the first standard code set of the N standard code sets according to a preset sequence;
if the service data is successfully matched with the basic standard codes of the first standard code set, stopping;
otherwise, matching the service data with the basic standard codes of a second standard code set in the N standard code sets according to the preset sequence until one basic standard code is matched; or until there is no match to any of the base standard codes.
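The ordered matching performed by the matching module can be sketched as a loop with early exit. This is an illustration under assumptions: a match is taken to succeed when the similarity exceeds a threshold (called `threshold_2` here), the dictionary insertion order stands in for the preset sequence, and the function name, sample codes, and `difflib`-based similarity are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional


def match_base_code(service_data: str,
                    code_sets: Dict[str, List[str]],
                    similarity: Callable[[str, str], float],
                    threshold_2: float) -> Optional[str]:
    """Return the base code of the first matching set, or None if none matches."""
    for base_code in code_sets:          # dict order serves as the preset sequence
        if similarity(service_data, base_code) > threshold_2:
            return base_code             # stop at the first successful match
    return None                          # no base standard code matched


def ratio(a: str, b: str) -> float:
    """Stand-in similarity measure; not prescribed by the application."""
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical code sets keyed by their base standard code.
code_sets = {"fever": ["fever", "high fever"], "cough": ["cough", "dry cough"]}

matched = match_base_code("coughs", code_sets, ratio, threshold_2=0.8)
unmatched = match_base_code("rash", code_sets, ratio, threshold_2=0.8)
```

In the worst case the loop performs N comparisons, one per base standard code, instead of M comparisons against every standard code.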
In a possible implementation manner, the first establishing module 703 is specifically configured to:
select the standard code in the target standard code set with the highest similarity to the service data as the target standard code.
In a possible implementation manner, the first establishing module 703 is specifically configured to:
use the base standard code of the target standard code set as the target standard code.
In a possible implementation manner, the first establishing module 703 is specifically configured to:
randomly select one standard code in the target standard code set as the target standard code.
The data mapping apparatus 70 provided in this embodiment may be used to implement the technical solution in the foregoing data mapping method embodiments of the present application, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 8 is a schematic structural diagram of a data normalization device according to an embodiment of the present application. As shown in fig. 8, the data normalization apparatus 80 provided in the embodiment of the present application may include: a processor 801 and a memory 802. Optionally, the data normalization apparatus 80 may further include a transceiver 803, and the transceiver 803 is used for communication with other apparatuses.
The memory 802 is used for storing executable instructions of the processor 801; the processor 801 is configured to execute the technical solution in the above-mentioned data mapping method embodiments of the present application by executing the executable instructions, and the implementation principle and the technical effect are similar, which are not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the technical solution in the embodiment of the data mapping method of the present application is implemented, and the implementation principle and the technical effect are similar, and are not described herein again.
It should be understood by those of ordinary skill in the art that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of the processes should be determined by their functions and inherent logic, and should not limit the implementation process of the embodiments of the present application.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A data mapping method, comprising:
receiving, by a data normalization device, service data sent by a service server, wherein the data normalization device includes M standard codes, the M standard codes belong to N standard code sets, 1 ≤ N < M, and N and M are integers;
the data standardization equipment matches the service data with one basic standard code of each standard code set in at least one standard code set;
and if the service data is successfully matched with the basic standard codes of the target standard code set in at least one standard code set, the data standardization equipment selects one target standard code in the target standard code set and establishes the mapping relation between the service data and the target standard code.
2. The method of claim 1, wherein before the data normalization device matches the service data with one base standard code of each of at least one of the standard code sets, the method further comprises:
the data normalization device establishes the N standard code sets.
3. The method of claim 2, wherein the data normalization device establishing the N standard code sets comprises:
the data normalization device calculates the similarity between a first standard code and the other standard codes in the M standard codes, and establishes the first standard code and the other standard codes whose similarity to the first standard code is greater than a preset threshold as one standard code set, wherein the first standard code is any standard code in the M standard codes;
and the data standardization equipment eliminates the standard codes which are divided into the standard code set from the M standard codes, selects one standard code from the rest standard codes, takes the standard code as a new first standard code, and continuously calculates the similarity between the new first standard code and other standard codes in the rest standard codes until each standard code is divided into the corresponding standard code set.
4. The method according to any of claims 1-3, wherein the data normalization device matching the service data with one base standard code of each of at least one of the standard code sets comprises:
the data standardization equipment matches the service data with the basic standard codes of the first standard code set of the N standard code sets according to a preset sequence;
if the service data is successfully matched with the basic standard codes of the first standard code set, stopping;
otherwise, the data standardization equipment matches the service data with the basic standard codes of a second standard code set in the N standard code sets according to the preset sequence until one basic standard code is matched; or until there is no match to any of the base standard codes.
5. The method of any of claims 1-3, wherein the data normalization device selects a target standard code from the set of target standard codes, comprising:
and the data standardization equipment selects the standard code with the highest similarity with the service data in the target standard code set as the target standard code.
6. The method of any of claims 1-3, wherein the data normalization device selects a target standard code from the set of target standard codes, comprising:
the data normalization device uses the base standard code of the target standard code set as the target standard code.
7. The method of any of claims 1-3, wherein the data normalization device selects a target standard code from the set of target standard codes, comprising:
the data normalization apparatus randomly selects one standard code from the target standard code set as the target standard code.
8. A data mapping apparatus, comprising:
a receiving module, configured to receive service data sent by a service server, where a data normalization device includes M standard codes, the M standard codes belong to N standard code sets, 1 ≤ N < M, and N and M are integers;
the matching module is used for matching the service data with one basic standard code of each standard code set in at least one standard code set;
the first establishing module is used for selecting a target standard code from the target standard code set and establishing a mapping relation between the service data and the target standard code if the matching module successfully matches the service data with the basic standard code of the target standard code set in at least one standard code set.
9. A data normalization apparatus, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any of claims 1-7 via execution of the executable instructions.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of claims 1-7.
CN201911187112.1A 2019-11-28 2019-11-28 Data mapping method, device, equipment and storage medium Active CN110955753B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911187112.1A CN110955753B (en) 2019-11-28 2019-11-28 Data mapping method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110955753A true CN110955753A (en) 2020-04-03
CN110955753B CN110955753B (en) 2023-04-18

Family

ID=69978689

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911187112.1A Active CN110955753B (en) 2019-11-28 2019-11-28 Data mapping method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110955753B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845058A (en) * 2015-12-04 2017-06-13 北大医疗信息技术有限公司 The standardized method of disease data and modular station
US20170235887A1 (en) * 2016-02-17 2017-08-17 International Business Machines Corporation Cognitive Mapping and Validation of Medical Codes Across Medical Systems
CN109584975A (en) * 2018-11-21 2019-04-05 金色熊猫有限公司 Medical data standardization processing method and device
CN109857736A (en) * 2018-12-29 2019-06-07 苏州市环亚数据技术有限公司 The data encoding of hospital's heterogeneous system unitized method and system, equipment, medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112069774A (en) * 2020-09-03 2020-12-11 微医云(杭州)控股有限公司 Data mapping method and device, electronic terminal and storage medium
CN114461714A (en) * 2022-01-13 2022-05-10 湖北国际物流机场有限公司 BIM code conversion system
CN114461714B (en) * 2022-01-13 2024-03-29 湖北国际物流机场有限公司 BIM code conversion system

Also Published As

Publication number Publication date
CN110955753B (en) 2023-04-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant