CN114020733A - Target data determination method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN114020733A
Authority
CN
China
Prior art keywords
data
source
target
candidate
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111308637.3A
Other languages
Chinese (zh)
Inventor
巩帆帆
曾丽娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Ruosu Technology Co ltd
Original Assignee
Shanghai Ruosu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Ruosu Technology Co ltd
Priority to CN202111308637.3A
Publication of CN114020733A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24558 Binary matching operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of this specification provide a target data determination method and device, computer equipment, a storage medium, and a computer program product. In the method, source data is received and matched against the pre-data in the corresponding mapping relation set. When the match succeeds, that part of the source data is matched to its target data efficiently and without manual intervention, which automates data cleansing and improves its efficiency. When the match fails, the source data is queried in a candidate data set to obtain a query result set containing at least one target candidate data, and the target data matching the source data is determined from the target candidate data, which improves matching accuracy. In turn, when an enterprise analyzes the cleansed flow direction data, the analysis results are more accurate, which supports the enterprise's decision making and product management and control.

Description

Target data determination method and device, computer equipment and storage medium
Technical Field
The embodiments of this specification relate to the technical field of data processing in the pharmaceutical industry, and in particular to a target data determination method, a target data determination device, computer equipment, a storage medium, and a computer program product.
Background
Across most pharmaceutical manufacturers, drugs and medical devices are typically sold in one of two ways: direct sales or sales through agents. In the distribution channels of pharmaceutical enterprises, accurate data on the flow of products (such as drugs or medical devices) has become the basis for enterprise decision making and control.
Pharmaceutical enterprises collect flow direction data so that marketing managers can see the monthly inventory in the distribution channels and the sales at market terminals, and can then make sales plans that fit future product demand. The flow direction data collected by a pharmaceutical enterprise is characterized by huge volume, many data sources, and low data quality.
Disclosure of Invention
In view of this, embodiments of this specification aim to provide a target data determination method, device, computer equipment, storage medium, and computer program product, so as to solve the technical problem in conventional technology that cleansing product flow direction data is inefficient.
An embodiment of this specification provides a target data determination method, comprising: receiving source data, where the source data represents attribute data of a medical product, a plurality of mapping relation sets are divided according to the attribute data of the medical product, and each mapping relation set comprises pre-data and candidate data that have an association; matching the source data against the pre-data in the corresponding mapping relation set; if the match fails, querying the source data in a candidate data set to obtain a query result set, where the candidate data set comprises a plurality of candidate data and the query result set comprises at least one target candidate data; and determining, among the target candidate data, the target data that matches the source data.
An embodiment of this specification provides a target data determination method, comprising: providing a data processing page for source data, where the source data represents attribute data of a medical product, a plurality of mapping relation sets are divided according to the attribute data of the medical product, and each mapping relation set comprises pre-data and candidate data that have an association; displaying processing information of the source data in the data processing page, where the processing information is generated based on the result of matching the source data against the pre-data in the corresponding mapping relation set, and the processing information has a corresponding matching operation control; when the matching operation control is triggered, displaying at least one target candidate data included in a query result set, where each target candidate data has a corresponding match confirmation control, the query result set is obtained by querying the source data in a candidate data set after the source data fails to match, and the candidate data set comprises a plurality of candidate data; and when the match confirmation control is triggered, determining, among the target candidate data, the target data that matches the source data.
An embodiment of this specification provides a target data determination device, comprising: a source data receiving module, configured to receive source data, where the source data represents attribute data of a medical product, a plurality of mapping relation sets are divided according to the attribute data of the medical product, and each mapping relation set comprises pre-data and candidate data that have an association; a source data matching module, configured to match the source data against the pre-data in the corresponding mapping relation set; a source data query module, configured to query the source data in a candidate data set if the match fails, to obtain a query result set, where the candidate data set comprises a plurality of candidate data and the query result set comprises at least one target candidate data; and a target data determining module, configured to determine, among the target candidate data, the target data that matches the source data.
An embodiment of this specification provides a target data determination device, comprising: a processing page providing module, configured to provide a data processing page for source data, where the source data represents attribute data of a medical product, a plurality of mapping relation sets are divided according to the attribute data of the medical product, and each mapping relation set comprises pre-data and candidate data that have an association; a processing information display module, configured to display processing information of the source data in the data processing page, where the processing information is generated based on the result of matching the source data against the pre-data in the corresponding mapping relation set, and the processing information has a corresponding matching operation control; a query result display module, configured to display at least one target candidate data included in a query result set when the matching operation control is triggered, where each target candidate data has a corresponding match confirmation control, the query result set is obtained by querying the source data in a candidate data set after the source data fails to match, and the candidate data set comprises a plurality of candidate data; and a target data determining module, configured to determine, among the target candidate data, the target data that matches the source data when the match confirmation control is triggered.
An embodiment of this specification provides a computer device comprising a memory and a processor, where the memory stores a computer program and the processor implements the method steps of the above embodiments when executing the computer program.
An embodiment of this specification provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method steps of the above embodiments.
An embodiment of this specification provides a computer program product comprising instructions that, when executed by a processor of a computer device, enable the computer device to perform the method steps of the above embodiments.
In the embodiments of this specification, source data is received and matched against the pre-data in the corresponding mapping relation set. When the match succeeds, that part of the source data is matched to its target data efficiently and without manual intervention, which automates data cleansing and improves its efficiency. When the match fails, the source data is queried in the candidate data set to obtain a query result set containing at least one target candidate data, and the target data matching the source data is determined from the target candidate data, which improves matching accuracy.
Drawings
FIG. 1a is an interaction diagram illustrating a method for determining target data in a scenario example according to an embodiment;
FIG. 1b is a diagram illustrating an application environment of a target data determination method according to an embodiment;
FIG. 2 is a schematic flow chart illustrating a target data determination method according to an embodiment;
FIG. 3 is a schematic flow chart diagram illustrating a target data determination method according to an embodiment;
FIG. 4 is a block diagram of a target data determination apparatus according to an embodiment;
FIG. 5 is a block diagram of a target data determination device according to an embodiment;
FIG. 6 is an internal structural diagram of a computer device according to an embodiment.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art without any inventive work based on the embodiments in the present specification belong to the protection scope of the present specification.
Some of the terms used in this specification are explained below. Flow direction data is generated in the sales chain of the pharmaceutical industry and includes at least one of sales-related data, inventory-related data, shipping-related data, and procurement-related data. With authorization from medical manufacturers, distributors, agents, pharmacies, and other institutions, flow direction data in the sales chain can be collected and stored in a data warehouse. A medical manufacturer may be a pharmaceutical institution or enterprise that produces and sells medical products. Medical products (referred to simply as products) include pharmaceutical products and medical device products. An institution can be understood as an entity involved in the circulation of a product; the institution type includes at least one of hospital, pharmacy, distributor, agent, and other institution. A hospital is a real-world health care facility. A pharmacy is a real-world pharmacy, including the headquarters of a pharmacy chain. A distributor refers generally to an entity responsible for the circulation and distribution of medical products. An agent refers generally to an organization responsible for selling medical products. Other institutions are institutions involved in the circulation of a product that are not covered by the types above.
The source data may be source institution data in the flow direction data, which includes at least one of a source institution name, a source institution code, and source institution address information. The source data may be source product data in the flow direction data, which includes at least one of a source product name, a source product code, a source product grade, and a source product trade name. The source data may also be source unit data in the flow direction data, which includes product units.
For example, the flow direction data may record that distributor A ships X units of product C to agent B, where the unit of product C is any of case, pack, or box. Distributor A and agent B are source institution data in the flow direction data, product C is source product data, and the unit in which the quantity X is counted is source unit data.
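The shipment example above can be sketched as a small record type. This is only an illustrative sketch: the class name `FlowRecord`, its field names, and the helper `source_data_of` are assumptions made here and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """One hypothetical flow direction record (all field names are assumed)."""
    seller: str    # source institution data, e.g. the distributor
    buyer: str     # source institution data, e.g. the agent
    product: str   # source product data
    quantity: int  # the quantity X that was shipped
    unit: str      # source unit data, e.g. "case", "pack", or "box"


def source_data_of(record: FlowRecord) -> dict[str, list[str]]:
    """Group the raw values of a record by the attribute they represent."""
    return {
        "institution": [record.seller, record.buyer],
        "product": [record.product],
        "unit": [record.unit],
    }


record = FlowRecord("Distributor A", "Agent B", "Product C", 10, "case")
grouped = source_data_of(record)
```

Grouping the values by attribute makes it straightforward to route each value to the mapping relation set for that attribute.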
In some embodiments, different groups of people refer to the same institution entity by different names. Take "Suzhou Tenth People's Hospital" as an example. People in Suzhou use the short form "Tenth Hospital" and understand it to mean Suzhou Tenth People's Hospital. Some people outside Suzhou use another short form, "Suzhou Tenth Hospital", for the same hospital. Other people, however, cannot tell which hospital is meant when they hear "Tenth Hospital" or "Suzhou Tenth Hospital". "Tenth Hospital", "Suzhou Tenth Hospital", and "Suzhou Tenth People's Hospital" all refer to the same institution entity. A mapping relation set therefore needs to be constructed: a mapping is established between "Tenth Hospital" and "Suzhou Tenth People's Hospital", and another between "Suzhou Tenth Hospital" and "Suzhou Tenth People's Hospital". The constructed mapping relation set comprises pre-data and candidate data that have an association. In this example, the pre-data may be "Tenth Hospital" and "Suzhou Tenth Hospital", and the candidate data may be "Suzhou Tenth People's Hospital". Pre-data can be understood as the different names that different groups use for the same entity, which can be ambiguous depending on who uses them. Candidate data can be understood, to a certain extent, as the full name of the entity, which points to it uniquely regardless of who uses it.
The institution name is used here only as an example. Similarly, the names used for the same product entity may not be uniform, and a product mapping relation set is formed by establishing mappings so that the different names all point to the same product entity. In some embodiments, the units of a product may also be inconsistent: agents, distributors, and pharmacies sell to different customers with different needs, and therefore use different product units. For example, a distributor may ship products to an agent by the case, the agent may ship products to a pharmacy by the pack, and the pharmacy may sell products by the box. In summary, the attribute data of a medical product may be an institution name, a product name, or a product unit. The plurality of mapping relation sets divided according to the attribute data of the medical product may be a name mapping relation set, a product mapping relation set, and a unit mapping relation set. The name mapping relation set comprises pre-institution data and standard institution names that have an association. The product mapping relation set comprises pre-product data and standard product names that have an association. The unit mapping relation set comprises pre-unit data and standard product units that have an association.
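One possible in-memory layout for the three mapping relation sets is a dictionary per attribute. The sketch below is an assumption for illustration only: the names `MAPPING_SETS` and `mapping_set_for` are hypothetical, the hospital entries follow the example in the text, and the product and unit entries are invented.

```python
# Hypothetical layout: one mapping relation set per attribute of the product.
# Each set maps pre-data (names seen in flow direction data) to candidate
# data (the standard name). Only the hospital entries follow the example in
# the text; the product and unit entries are invented for illustration.
MAPPING_SETS: dict[str, dict[str, str]] = {
    "institution": {
        "Tenth Hospital": "Suzhou Tenth People's Hospital",
        "Suzhou Tenth Hospital": "Suzhou Tenth People's Hospital",
    },
    "product": {"Prod. C": "Product C"},
    "unit": {"bx": "box", "pk": "pack"},
}


def mapping_set_for(attribute: str) -> dict[str, str]:
    """Return the mapping relation set that corresponds to an attribute."""
    return MAPPING_SETS[attribute]
```

Several pre-data entries may point to the same candidate data, which is how different names for one entity are unified.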
In some embodiments, the candidate data set belongs to the target enterprise that produces product C. The candidate data set can be understood as the enterprise database of the target enterprise, and candidate data as the data stored in that database. For example, the enterprise database may store a standard institution name, a standard institution code, a standard product name, a standard product code, a standard unit, and similar information. The enterprise database may also store institution attribute information. For example, when the institution is a hospital, the database may record whether the hospital is Grade III Class A, public, or private.
Refer to FIG. 1a. In a specific scenario example, a candidate data set and mapping relation sets are deployed on a server in advance. A user accesses a web page provided by the server through a terminal, and the terminal displays a data upload page on its operation interface. The data upload page has a data import control, through which flow direction data can be uploaded to the server either by specifying the storage path of the flow direction data or by dragging the file onto the control. Once the server has obtained the flow direction data, the data can enter the data cleansing process. The flow direction data file may be an Excel file or a ZIP package. The flow direction data includes source data, which represents attribute data of a medical product. A plurality of mapping relation sets are divided according to the attribute data of the medical product, and each mapping relation set comprises pre-data and candidate data that have an association. The source data may be at least one of source institution data, source product data, and source unit data. The mapping relation set may be at least one of a name mapping relation set, a product mapping relation set, and a unit mapping relation set. The data cleansing process may include at least one of an institution matching process, a product matching process, and a unit matching process.
The mapping relation sets deployed on the server in advance comprise pre-data and candidate data that have an association. The server obtains the source data from the flow direction data; the source data represents attribute data of a medical product, and the attribute it represents corresponds to one mapping relation set. The server uses the source data to query the corresponding mapping relation set, that is, it matches the source data against the pre-data in that set, and returns the matching result to the terminal.
The terminal may provide a data processing page for the source data and display processing information of the source data in that page. The processing information is generated based on the result of matching the source data against the pre-data in the corresponding mapping relation set. In some embodiments, the processing information includes a processed result and a pending result. The processed result includes the number of source data items that were successfully matched to pre-data in the corresponding mapping relation set, and processed data is source data that matched successfully. The pending result includes the number of source data items that were not matched to pre-data in the corresponding mapping relation set, and pending data is source data that was not matched. Pending data has a corresponding matching operation control.
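The processed and pending results described above could be computed along the following lines. This is a minimal sketch under assumptions: the function name `build_processing_info` and the keys of the returned dictionary are hypothetical, and matching is taken to be an exact lookup of the source data among the pre-data.

```python
def build_processing_info(sources: list[str], mapping_set: dict[str, str]) -> dict:
    """Summarize which source data items matched pre-data and which did not."""
    processed = [s for s in sources if s in mapping_set]
    pending = [s for s in sources if s not in mapping_set]
    return {
        "processed_count": len(processed),  # shown as the processed result
        "pending_count": len(pending),      # shown as the pending result
        "pending": pending,                 # items that need a matching operation
    }


info = build_processing_info(
    ["Tenth Hospital", "Suzhou No. 10 Hosp."],
    {"Tenth Hospital": "Suzhou Tenth People's Hospital"},
)
```

Each entry in `pending` is an item for which the page would offer a matching operation control.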
The terminal monitors the matching operation control and, when it detects that the control has been triggered, sends a matching operation instruction to the server. The matching operation instruction carries the query keyword corresponding to the source data. The server queries the candidate data set with that query keyword to obtain a query result set. The candidate data set comprises a plurality of candidate data.
The server returns the query result set to the terminal. The query result set comprises at least one target candidate data, and the terminal displays the target candidate data included in the query result set. Each target candidate data has a corresponding match confirmation control.
The terminal monitors the match confirmation control. When it detects that the control has been triggered, it selects, among the target candidate data, the target data that matches the source data and sends a match confirmation instruction to the server. The match confirmation instruction carries the determined target data and the source data. The server establishes an association between the source data and the determined target data so as to update the mapping relation set.
Referring to FIG. 1b, this specification provides a flow direction data cleansing system, to which the target data determination method provided herein is applied. The flow direction data cleansing system may include a hardware environment formed by a terminal 110 and a server 120, which communicate over a network. A mapping relation set and a candidate data set are deployed on the server 120 in advance. The candidate data set comprises a plurality of candidate data. The mapping relation set corresponds to attribute data of a medical product and comprises pre-data and candidate data that have an association. Specifically, the terminal 110 uploads source data to the server 120, and the server 120 matches the source data against the pre-data in the corresponding mapping relation set. If the match fails, the server 120 queries the source data in the candidate data set to obtain a query result set that includes at least one target candidate data. The target data that matches the source data is then determined among the target candidate data.
The terminal 110 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server 120 may be implemented as a stand-alone server or a server cluster composed of a plurality of servers. With the development of scientific technology, new computing devices, such as quantum computing servers, may be developed, and may also be applied to the embodiments of the present specification.
Referring to fig. 2, an embodiment of the present disclosure provides a method for determining target data. The target data determination method includes the following steps.
S210: Receive source data.
The source data represents attribute data of a medical product. Specifically, in some embodiments, source data uploaded by a terminal to the server may be received. The source data file may be an Excel file or a ZIP package.
In some embodiments, a cleansing instruction for the flow direction data, sent by the terminal to the server, may be received. The cleansing instruction carries the source data. The source data may be the name of an institution associated with a medical product, the product name of a medical product, or the unit of a medical product. In some embodiments, the source data may be keywords. The keywords may include an institution name; they may include a product name, an institution name, and a product specification; or they may include a product name, a product unit, and a product specification.
S220: Match the source data against the pre-data in the corresponding mapping relation set.
The source data represents attribute data of a medical product. Different attributes correspond to different mapping relation sets, that is, a plurality of mapping relation sets are divided according to the attribute data of the medical product. The mapping relation set may be at least one of a name mapping relation set, a product mapping relation set, and a unit mapping relation set. Each mapping relation set comprises pre-data and candidate data that have an association. In some embodiments, the pre-data and the candidate data point to the same entity. The pre-data and the candidate data may be different names for the same institution, different names for the same medical product, or different packaging units of the same medical product.
Pre-data can be understood as the "dirty data" in flow direction data generated in the sales chain of the pharmaceutical industry. The pre-data may be an abbreviated name or an alias of an institution. The pre-data may be an abbreviated product name or a trade name of a medical product. The pre-data may be the name of a unit of measure used to count a medical product. The candidate data may be the name that the pharmaceutical enterprise uses for the institution. To a certain extent, candidate data can also be understood as the standard institution name of an institution, the standard product name of a medical product, or the standard product unit of a medical product.
Specifically, the source data represents attribute data of a medical product and is used to query the mapping relation set corresponding to that attribute. The source data is compared with the pre-data in the corresponding mapping relation set to determine whether the two are consistent. If they are consistent, the source data is judged to have matched successfully in the corresponding mapping relation set. If they are not consistent, the source data is judged to have failed to match in the corresponding mapping relation set.
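The consistency check in S220 can be sketched as an exact lookup of the source data among the pre-data. The function name `match_pre_data` and the dictionary layout are assumptions for illustration; the specification does not state how consistency is tested, so plain string equality is used here.

```python
from typing import Optional


def match_pre_data(source: str, mapping_set: dict[str, str]) -> Optional[str]:
    """Return the candidate data associated with `source`, or None if no
    pre-data in the mapping relation set is identical to it (match failure)."""
    return mapping_set.get(source)


name_mapping = {
    "Tenth Hospital": "Suzhou Tenth People's Hospital",
    "Suzhou Tenth Hospital": "Suzhou Tenth People's Hospital",
}

matched = match_pre_data("Tenth Hospital", name_mapping)         # success
unmatched = match_pre_data("Suzhou No. 10 Hosp.", name_mapping)  # failure
```

A `None` result corresponds to the failed-match branch that leads into S230.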
S230: If the match fails, query the source data in the candidate data set to obtain a query result set.
The candidate data set is a set of candidate data and comprises a plurality of candidate data. Candidate data can be understood as the standard institution name of an institution, the standard product name of a medical product, or the standard product unit of a medical product, as defined by the pharmaceutical enterprise. Candidate data can also be understood as the standard institution names and standard product units required by medical regulatory authorities. In some embodiments, the candidate data set includes the candidate data in the mapping relation sets.
Specifically, if the match fails, the source data is used to query the candidate data, and at least one target candidate data corresponding to the source data is retrieved. The retrieved target candidate data constitutes the query result set. In some embodiments, the target candidate data in the query result set is ranked by confidence.
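The query in S230 with confidence ranking might look like the sketch below. The specification does not say how confidence is computed; this sketch swaps in the string similarity ratio from Python's standard `difflib` module as a stand-in. The function name `query_candidates`, the parameters `top_k` and `min_score`, and the sample hospital names other than the one from the text are assumptions.

```python
from difflib import SequenceMatcher


def query_candidates(
    source: str,
    candidate_set: list[str],
    top_k: int = 3,
    min_score: float = 0.3,
) -> list[tuple[str, float]]:
    """Return up to `top_k` target candidate data with a similarity score,
    ordered from most to least similar to `source`."""
    scored = [
        (candidate, SequenceMatcher(None, source, candidate).ratio())
        for candidate in candidate_set
    ]
    # Keep only candidates that resemble the source data at all.
    hits = [pair for pair in scored if pair[1] >= min_score]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:top_k]


candidates = [
    "Suzhou First People's Hospital",
    "Suzhou Tenth People's Hospital",
    "Nanjing Drum Tower Hospital",
]
result_set = query_candidates("Suzhou Tenth Hosp.", candidates)
```

A production system would more likely rely on a search index or a domain-specific matcher; the ratio here only illustrates ranking the query result set.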
S240: Determine, among the target candidate data, the target data that matches the source data.
Specifically, a query result set is obtained by querying the source data in the candidate data set. In some embodiments, the terminal may present the target candidate data in the query result set and, in response to a match confirmation operation on any target candidate data, take the confirmed target candidate data as the target data in the query result set that matches the source data. In other embodiments, the terminal may present the target candidate data in the query result set and, in response to a match confirmation operation on any target candidate data, send a match confirmation instruction to the server. The match confirmation instruction carries the confirmed target candidate data, and the server takes the confirmed target candidate data as the target data that matches the source data. Further, the source data may then be cleansed using the target data.
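Steps S220 through S240 can be tied together in one function, sketched below. Everything here is an illustrative assumption: the name `determine_target`, the `confirm` callback that stands in for the user's match confirmation operation, the limit of three results, and the use of `difflib` similarity to order the query result set.

```python
from difflib import SequenceMatcher
from typing import Callable


def determine_target(
    source: str,
    mapping_set: dict[str, str],
    candidate_set: list[str],
    confirm: Callable[[str, list[str]], str],
) -> str:
    """Return the target data for `source` following S220 to S240."""
    # S220: match against the pre-data; success needs no manual step.
    target = mapping_set.get(source)
    if target is not None:
        return target
    # S230: on failure, query the candidate data set (similarity is assumed).
    ranked = sorted(
        candidate_set,
        key=lambda c: SequenceMatcher(None, source, c).ratio(),
        reverse=True,
    )
    result_set = ranked[:3]
    # S240: the confirmation step picks the target data from the result set.
    return confirm(source, result_set)


mapping = {"Tenth Hospital": "Suzhou Tenth People's Hospital"}
candidates = ["Suzhou First People's Hospital", "Suzhou Tenth People's Hospital"]


def pick_first(source: str, result_set: list[str]) -> str:
    """Stand-in for a user who confirms the top-ranked candidate."""
    return result_set[0]


auto = determine_target("Tenth Hospital", mapping, candidates, pick_first)
manual = determine_target("Suzhou Tenth Hosp.", mapping, candidates, pick_first)
```

Passing the confirmation step in as a callback keeps the matching logic separate from the page controls that collect the user's choice.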
In the target data determination method above, source data is received and matched against the pre-data in the corresponding mapping relation set. When the match succeeds, that part of the source data is matched to its target data efficiently and without manual intervention, which automates data cleansing and improves its efficiency. When the match fails, the source data is queried in the candidate data set to obtain a query result set containing at least one target candidate data, and the target data matching the source data is determined from the target candidate data, which improves matching accuracy. In turn, when an enterprise analyzes the cleansed flow direction data, the analysis results are more accurate, which supports the enterprise's decision making and product management and control.
In some embodiments, the target data determination method may further include: establishing an association relationship between the source data and the determined target data to update the mapping relation set.
Specifically, in the current data cleaning process, the source data has no successfully matched target data in the corresponding mapping relation set. The source data is therefore queried in the candidate data set to obtain a query result set, and the target data matched with the source data is determined in the query result set. Once that target data has been determined, the source data is known to match it, that is, an association exists between the source data and the determined target data. Accordingly, an association relationship is established between the source data and the determined target data to update the mapping relation set. In this embodiment, because the mapping relation set is updated, the next time the same source data enters the data cleaning process it is matched successfully in the updated mapping relation set, no query in the candidate data set is needed, and data cleaning efficiency is improved.
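The effect of updating the mapping relation set can be illustrated with a small sketch. It assumes, purely for illustration, that the mapping relation set is a dict and that `query` and `confirm` are hypothetical callables standing in for the candidate query and the matching confirmation operation.

```python
def clean(source, mapping, query, confirm):
    """Return (target, used_query) and record newly confirmed associations."""
    if source in mapping:                 # matched: no candidate query needed
        return mapping[source], False
    target = confirm(query(source))       # query result set -> confirmed target
    if target is not None:
        mapping[source] = target          # update the mapping relation set
    return target, True


mapping = {}
query = lambda s: ["Amoxicillin Capsules 0.25 g"]        # hypothetical recall
confirm = lambda results: results[0] if results else None

first = clean("Amoxicillin caps 0.25g", mapping, query, confirm)
second = clean("Amoxicillin caps 0.25g", mapping, query, confirm)
```

On the first pass the source data goes through the query and confirmation path; on the second pass the same source data is resolved from the updated mapping without a query.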
In some embodiments, the source data includes source institution data, the mapping relation set includes a name mapping relation set, and the candidate data set includes an enterprise data set of the target enterprise. Matching the source data with the prepositive data in the corresponding mapping relation set may include: matching the source institution data with the prepositive institution data in the name mapping relation set. Accordingly, querying the source data in the candidate data set to obtain a query result set when matching fails may include: when the source institution data fails to match, querying the source institution data in the enterprise data set and recalling an institution name set corresponding to the source institution data.
Specifically, source data including source institution data is received. The mapping relation set corresponding to the source institution data is the name mapping relation set, which includes prepositive institution data and candidate institution names having an association relationship. The source institution data is matched with the prepositive institution data in the name mapping relation set. When no prepositive institution data consistent with the source institution data exists in the name mapping relation set, the matching fails. The enterprise data set then needs to be queried according to the source institution data to recall at least one target institution name corresponding to the source institution data. The at least one target institution name constitutes an institution name set. Further, the institution name set includes at least one target candidate institution data, and the target institution data is determined among the target candidate institution data.
In some embodiments, the candidate data set is the enterprise data set of the target enterprise. The enterprise data set includes candidate institution names, including the candidate institution names in the name mapping relation set.
In some embodiments, when prepositive institution data consistent with the source institution data exists in the name mapping relation set, the matching succeeds, and the candidate institution data associated with that prepositive institution data can be determined as the target institution data. This reduces the data cleaning work for flow direction data and improves data cleaning efficiency. Further, after the target institution data is determined, a name association relationship may be automatically established between the source institution data and the determined target institution data to update the name mapping relation set. In some embodiments, the established name association relationships can be displayed, each with a corresponding relationship modification control. The relationship modification control includes a re-matching control and a cancel-matching control. The relationship modification control is monitored, and if it is triggered, the name association relationship is modified in response to the relationship modification instruction.
In this embodiment, the source institution data is first matched with the prepositive institution data in the name mapping relation set, so that the target institution data corresponding to the source institution data is matched quickly. Then, when the source institution data fails to match, the source institution data is queried in the enterprise data set and the institution name set corresponding to the source institution data is recalled, which ensures that the target institution name corresponding to the source institution data is determined accurately and improves the accuracy of data cleaning.
In some embodiments, the name mapping relation set may also be referred to as institution matching relationship data. The name mapping relation set can be stored on the server by batch import. In that case, an institution matching relationship template is obtained from the server, and the institution matching relationship data is generated according to the template. In some embodiments, the institution matching relationship template is provided to the user as an EXCEL file. The template may include a dealer code, a dealer name, an original institution name, a standard institution code, and a standard institution name. In some embodiments, the name mapping relation set may also be stored on the server by adding entries one at a time. An add-matching-relationship control is provided in an interface of the terminal. When the terminal detects that the control is triggered, it displays an add-matching-relationship page. The source institution data and the target institution data are determined through this page, and a name mapping relationship is established between them.
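As a rough illustration of batch import, the sketch below parses rows that follow the five template columns listed above. It is a sketch under assumptions: CSV text is used in place of an EXCEL file to keep the example dependency-free, the snake_case column names are invented for the example, and keying each relationship by dealer code plus original institution name is an assumption rather than something the specification states.

```python
import csv
import io

# Hypothetical column names mirroring the five template fields.
TEMPLATE_COLUMNS = ["dealer_code", "dealer_name", "original_institution_name",
                    "standard_institution_code", "standard_institution_name"]


def import_matching_relations(text):
    """Parse template rows into {(dealer_code, original_name): (std_code, std_name)}."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != TEMPLATE_COLUMNS:
        raise ValueError("file does not follow the matching relationship template")
    relations = {}
    for row in reader:
        key = (row["dealer_code"], row["original_institution_name"])
        relations[key] = (row["standard_institution_code"],
                          row["standard_institution_name"])
    return relations


sample = (
    "dealer_code,dealer_name,original_institution_name,"
    "standard_institution_code,standard_institution_name\n"
    "D001,East Pharma Trading,City No.1 Hosp.,H1001,City First Hospital\n"
)
relations = import_matching_relations(sample)
```

Rejecting files whose header does not match the template is one simple way to catch malformed imports before they reach the mapping relation set.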
In some embodiments, the target enterprise also has an institution alias data set. The target data determination method may further include: when no institution name set is obtained by querying the enterprise data set, querying the institution alias data set according to the source institution data and recalling the institution name set corresponding to the source institution data.
The institution alias data set stores institution aliases and the correspondence between each institution alias and a candidate institution name. Specifically, source data including source institution data is received. The mapping relation set corresponding to the source institution data is the name mapping relation set, which includes prepositive institution data and candidate institution names having an association relationship. The source institution data is matched with the prepositive institution data in the name mapping relation set. When no prepositive institution data consistent with the source institution data exists in the name mapping relation set, the matching fails, and the enterprise data set needs to be queried according to the source institution data. When no institution name set is obtained by querying the enterprise data set, the institution alias data set is queried using the source institution data to obtain the institution name set corresponding to the source institution data. Further, the institution name set includes at least one target candidate institution data, and the target institution data is determined among the target candidate institution data. In this embodiment, further querying the institution alias data set to recall the institution name set provides users with a more complete set of accurate candidate institution names, which helps establish a more accurate mapping relationship.
In some embodiments, the target enterprise belongs to a target industry, and the target industry has an industry data set. The target data determination method may further include: when the source institution data fails to match, querying the source institution data in the industry data set and recalling an institution name set corresponding to the source institution data.
Specifically, source data including source institution data is received. The mapping relation set corresponding to the source institution data is the name mapping relation set, which includes prepositive institution data and candidate institution names having an association relationship. The source institution data is matched with the prepositive institution data in the name mapping relation set. When no prepositive institution data consistent with the source institution data exists in the name mapping relation set, the matching fails. The industry data set then needs to be queried according to the source institution data, and the institution name set corresponding to the source institution data is recalled. Further, the institution name set includes at least one target candidate institution data, and the target institution data is determined among the target candidate institution data.
In some embodiments, when no institution name set is obtained by querying either the enterprise data set or the institution alias data set, the industry data set is queried using the source institution data to obtain the institution name set corresponding to the source institution data.
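The cascade from the enterprise data set to the alias data set and then to the industry data set can be sketched as below. The sketch assumes case-insensitive substring matching as the query, which the specification does not prescribe, and the function name `recall_institution_names` and the sample data are hypothetical.

```python
def recall_institution_names(source, enterprise, aliases, industry):
    """Try each data set in turn; return (data_set_label, institution_names)."""
    key = source.lower()
    # 1. Enterprise data set: candidate institution names of the target enterprise.
    hits = [n for n in enterprise if key in n.lower()]
    if hits:
        return "enterprise", hits
    # 2. Alias data set: maps an institution alias to a candidate institution name.
    hits = sorted({name for alias, name in aliases.items() if key in alias.lower()})
    if hits:
        return "alias", hits
    # 3. Industry data set shared across the target industry.
    hits = [n for n in industry if key in n.lower()]
    return ("industry", hits) if hits else (None, [])


enterprise = ["City First Hospital"]
aliases = {"No.1 Hospital": "City First Hospital"}
industry = ["Riverside Community Clinic"]

from_enterprise = recall_institution_names("city first", enterprise, aliases, industry)
from_alias = recall_institution_names("no.1", enterprise, aliases, industry)
from_industry = recall_institution_names("riverside", enterprise, aliases, industry)
```

Returning the label of the data set that produced the hits makes it easy to show the user where each candidate came from.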
In some embodiments, the target data determination method may further include: when the matching succeeds, determining the candidate data associated with the prepositive data that matched the source data as the target data. The target institution data corresponding to the source institution data is thus matched quickly, human intervention in the data cleaning process is reduced, and the accuracy and efficiency of data cleaning are improved.
In some embodiments, the source data includes at least one of source product data and source unit data. The mapping relation set includes at least one of a product mapping relation set and a unit mapping relation set. Matching the source data with the prepositive data in the corresponding mapping relation set includes at least one of the following: matching the source product data with the prepositive product data in the product mapping relation set; or matching the source unit data with the prepositive unit data in the unit mapping relation set.
Specifically, in some embodiments, the source data includes source product data. The mapping relation set corresponding to the product name attribute is the product mapping relation set, which includes prepositive product data and candidate product names having an association relationship. The source product data is matched with the prepositive product data in the product mapping relation set. When no prepositive product data consistent with the source product data exists in the product mapping relation set, the matching fails. The enterprise data set is then queried according to the source product data, the institution name, and the product specification, and at least one corresponding target product name is recalled. The at least one target product name constitutes a product name set. Further, the product name set includes at least one target candidate product data, and the target product data is determined among the target candidate product data.
In some embodiments, the source data includes source unit data. The mapping relation set corresponding to the product unit attribute is the unit mapping relation set, which includes prepositive unit data and candidate product units having an association relationship. The source unit data is matched with the prepositive unit data in the unit mapping relation set. When no prepositive unit data consistent with the source unit data exists in the unit mapping relation set, the matching fails. The enterprise data set then needs to be queried according to the source product name, the product unit, and the product specification, and at least one corresponding target product unit is recalled. The at least one target product unit constitutes a product unit set. Further, the product unit set includes at least one target candidate product unit, and the target product unit is determined among the target candidate product units.
In this embodiment, source product data in the flow direction data is cleaned directly into target product data through the product mapping relation set, and source unit data in the flow direction data is cleaned directly into target product units through the unit mapping relation set, which reduces manual intervention and improves the accuracy and efficiency of data cleaning.
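A multi-field product query of the kind described above might look like the following sketch. It makes several assumptions that are not in the specification: candidates are filtered by an exact specification match, the remaining names are ranked with `difflib` string similarity, and the institution name is left out of the query for brevity. The function name `recall_product_names` and the sample products are hypothetical.

```python
from difflib import SequenceMatcher


def recall_product_names(source_name, source_spec, enterprise_products, top_k=3):
    """enterprise_products: list of (product_name, specification) tuples."""
    # Keep only candidates whose specification equals the source specification.
    same_spec = [name for name, spec in enterprise_products if spec == source_spec]

    # Rank the remaining names by similarity to the source product name.
    def similarity(name):
        return SequenceMatcher(None, source_name.lower(), name.lower()).ratio()

    return sorted(same_spec, key=similarity, reverse=True)[:top_k]


products = [
    ("Amoxicillin Capsules", "0.25 g"),
    ("Amoxicillin Capsules", "0.5 g"),
    ("Ibuprofen Tablets", "0.25 g"),
]
product_name_set = recall_product_names("amoxicillin caps", "0.25 g", products)
```

The returned list plays the role of the product name set from which the target product data is then confirmed.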
An embodiment of this specification provides a target data determination method that includes the following steps.
S302, receiving source data.
The source data represents attribute data of a pharmaceutical and medical device product. A plurality of mapping relation sets are divided according to the attribute data of the pharmaceutical and medical device product. Each mapping relation set includes prepositive data and candidate data having an association relationship.
In some embodiments, the prepositive data and the candidate data point to the same pharmaceutical and medical device entity. The candidate data set includes the candidate data in the mapping relation sets. The source data includes source institution data, source product data, and source unit data.
In some embodiments, the mapping relation sets include a name mapping relation set, a product mapping relation set, and a unit mapping relation set. The name mapping relation set includes prepositive institution data and candidate institution data having an association relationship. The product mapping relation set includes prepositive product data and candidate product data having an association relationship. The unit mapping relation set includes prepositive unit data and candidate unit data having an association relationship.
In some embodiments, the candidate data set includes an enterprise data set of the target enterprise. The target enterprise also has an institution alias data set. The target enterprise belongs to a target industry, and the target industry has an industry data set.
S304, matching the source institution data with the prepositive institution data in the name mapping relation set.
S306, when the source institution data is matched successfully, determining the candidate institution data associated with the prepositive institution data that matched the source institution data as the target institution data.
S308, when the source institution data fails to match, querying the source institution data in the enterprise data set and recalling an institution name set corresponding to the source institution data.
The institution name set includes at least one target candidate institution data.
In some embodiments, when no institution name set is obtained by querying the enterprise data set, the institution alias data set is queried according to the source institution data and the institution name set corresponding to the source institution data is recalled.
In some embodiments, when the source institution data fails to match, the source institution data is queried in the industry data set to recall the institution name set corresponding to the source institution data.
S310, determining, among the target candidate institution data, target institution data that matches the source institution data.
S312, establishing an association relationship between the source institution data and the determined target institution data to update the name mapping relation set.
S314, matching the source product data with the prepositive product data in the product mapping relation set.
S316, when the source product data is matched successfully, determining the candidate product data associated with the prepositive product data that matched the source product data as the target product data.
S318, when the source product data fails to match, querying the source product data in the enterprise data set and recalling a product name set corresponding to the source product data.
The product name set includes at least one target candidate product data.
S320, determining, among the target candidate product data, target product data that matches the source product data.
S322, establishing an association relationship between the source product data and the determined target product data to update the product mapping relation set.
S324, matching the source unit data with the prepositive unit data in the unit mapping relation set.
S326, when the source unit data is matched successfully, determining the candidate unit data associated with the prepositive unit data that matched the source unit data as the target unit data.
S328, when the source unit data fails to match, querying the source unit data in the enterprise data set and recalling a product unit set corresponding to the source unit data.
The product unit set includes at least one target candidate product unit.
S330, determining, among the target candidate product units, a target product unit that matches the source unit data.
S332, establishing an association relationship between the source unit data and the determined target product unit to update the unit mapping relation set.
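Steps S304 to S332 apply the same match, query, confirm, and update pattern to each of the three attributes, which can be sketched as a single loop. This is a minimal sketch under assumptions: each mapping relation set is a dict, and `query` and `confirm` are hypothetical callables standing in for the enterprise data set query and the confirmation operation. The prefix-based query in the example is purely illustrative.

```python
ATTRIBUTES = ("institution", "product", "unit")


def clean_record(record, mapping_sets, query, confirm):
    """Clean one flow direction record attribute by attribute."""
    cleaned = {}
    for attr in ATTRIBUTES:
        source = record[attr]
        mapping = mapping_sets[attr]
        if source in mapping:                        # S306 / S316 / S326
            cleaned[attr] = mapping[source]
            continue
        result_set = query(attr, source)             # S308 / S318 / S328
        target = confirm(attr, source, result_set)   # S310 / S320 / S330
        if target is not None:
            mapping[source] = target                 # S312 / S322 / S332
        cleaned[attr] = target
    return cleaned


mapping_sets = {
    "institution": {"City No.1 Hosp.": "City First Hospital"},
    "product": {},
    "unit": {"bx": "box"},
}
enterprise = {
    "institution": ["City First Hospital"],
    "product": ["Amoxicillin Capsules"],
    "unit": ["box", "bottle"],
}
# Illustrative query: candidates sharing the first four characters of the source.
query = lambda attr, source: [c for c in enterprise[attr]
                              if c.lower().startswith(source.lower()[:4])]
confirm = lambda attr, source, result_set: result_set[0] if result_set else None

record = {"institution": "City No.1 Hosp.", "product": "Amoxicillin caps", "unit": "bx"}
cleaned = clean_record(record, mapping_sets, query, confirm)
```

In this example the institution and unit are resolved from their mapping relation sets, while the product goes through the query path and is then added to the product mapping relation set.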
Referring to fig. 3, an embodiment of the present disclosure provides a method for determining target data. The target data determination method includes the following steps.
S410, providing a data processing page for source data.
The source data represents attribute data of a pharmaceutical and medical device product; a plurality of mapping relation sets are divided according to the attribute data of the pharmaceutical and medical device product; each mapping relation set includes prepositive data and candidate data having an association relationship.
S420, displaying processing information of the source data in the data processing page.
The processing information is generated based on the matching result between the source data and the prepositive data in the corresponding mapping relation set; the processing information has a corresponding matching operation control.
S430, when the matching operation control is triggered, displaying at least one target candidate data included in the query result set.
Each target candidate data has a corresponding matching confirmation control; the query result set is obtained by querying the source data in the candidate data set when the source data fails to match; the candidate data set includes a plurality of candidate data.
S440, when the matching confirmation control is triggered, determining, among the target candidate data, target data that matches the source data.
In some embodiments, the processing information includes a processed result and/or a to-be-processed result. The processed result is generated based on the number of source data items that successfully match the prepositive data in the corresponding mapping relation set; the to-be-processed result is generated based on the number of source data items that do not match the prepositive data in the corresponding mapping relation set.
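The two counts behind the processing information can be computed as in the sketch below, which again assumes the mapping relation set is a dict keyed by prepositive data. The function name and the result keys are hypothetical.

```python
def processing_info(sources, mapping):
    """Count source data that matched / did not match the prepositive data."""
    processed = sum(1 for s in sources if s in mapping)
    return {"processed": processed, "to_be_processed": len(sources) - processed}


info = processing_info(["bx", "btl", "pcs"], {"bx": "box"})
```

A data processing page could show these two numbers and attach the matching operation control to the to-be-processed count.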
For specific limitations of the target data determination method applied to the terminal, reference may be made to the above limitations of the target data determination method, which are not repeated here.
It should be understood that, although the steps in the above flowcharts are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in the above flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Referring to fig. 4, an embodiment of the present disclosure provides a target data determination apparatus 400. The determination apparatus 400 includes a source data receiving module 410, a source data matching module 420, a source data querying module 430, and a target data determining module 440.
A source data receiving module 410, configured to receive source data; the source data represents attribute data of a pharmaceutical and medical device product; a plurality of mapping relation sets are divided according to the attribute data of the pharmaceutical and medical device product; each mapping relation set includes prepositive data and candidate data having an association relationship.
A source data matching module 420, configured to match the source data with the prepositive data in the corresponding mapping relation set.
A source data query module 430, configured to query the source data in the candidate data set to obtain a query result set when matching fails; the candidate data set includes a plurality of candidate data, and the query result set includes at least one target candidate data.
A target data determining module 440, configured to determine, among the target candidate data, target data that matches the source data.
Referring to fig. 5, an embodiment of the present disclosure provides a target data determination apparatus 500. The determining apparatus 500 includes a processing page providing module 510, a processing information presentation module 520, a query result presentation module 530, and a target data determining module 540.
A processing page providing module 510, configured to provide a data processing page for source data; the source data represents attribute data of a pharmaceutical and medical device product; a plurality of mapping relation sets are divided according to the attribute data of the pharmaceutical and medical device product; each mapping relation set includes prepositive data and candidate data having an association relationship.
A processing information presentation module 520, configured to present processing information of the source data in the data processing page; the processing information is generated based on the matching result between the source data and the prepositive data in the corresponding mapping relation set; the processing information has a corresponding matching operation control.
A query result presentation module 530, configured to present at least one target candidate data included in the query result set when the matching operation control is triggered; each target candidate data has a corresponding matching confirmation control; the query result set is obtained by querying the source data in the candidate data set when the source data fails to match; the candidate data set includes a plurality of candidate data.
A target data determining module 540, configured to determine, when the matching confirmation control is triggered, target data that matches the source data among the target candidate data.
For specific limitations of the target data determination device, reference may be made to the above limitations of the target data determination method, which are not repeated here. Each module in the above target data determination device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and perform the operations corresponding to each module.
In some embodiments, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a target data determination method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of part of the structure related to the disclosed solution and does not limit the computer device to which the disclosed solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In some embodiments, a computer device is provided, comprising a memory having a computer program stored therein and a processor that, when executing the computer program, performs the method steps of the above embodiments.
In some embodiments, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the method steps in the above-described embodiments.
In some embodiments, a computer program product is also provided, which comprises instructions that are executable by a processor of a computer device to implement the method steps in the above-described embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).
The features of the above embodiments may be arbitrarily combined, and for the sake of brevity, all possible combinations of the features in the above embodiments are not described, but should be construed as being within the scope of the present specification as long as there is no contradiction between the combinations of the features.
The above description is only for the purpose of illustrating the preferred embodiments of the present disclosure and is not to be construed as limiting the present disclosure, and any modifications, equivalents and the like that are within the spirit and principle of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (15)

1. A method for target data determination, the method comprising:
receiving source data; wherein the source data represents attribute data of a pharmaceutical and medical device product; a plurality of mapping relation sets are divided according to the attribute data of the pharmaceutical and medical device product; each mapping relation set comprises prepositive data and candidate data having an association relationship;
matching the source data with the prepositive data in the corresponding mapping relation set;
when the matching fails, querying the source data in a candidate data set to obtain a query result set; wherein the candidate data set comprises a plurality of candidate data, and the query result set comprises at least one target candidate data;
and determining, among the target candidate data, target data that matches the source data.
2. The method of claim 1, further comprising:
establishing an association relationship between the source data and the determined target data to update the mapping relation set.
3. The method of claim 1, wherein the source data comprises source institution data, the mapping relation set comprises a name mapping relation set, and the candidate data set comprises an enterprise data set of a target enterprise; the matching the source data with the prepositive data in the corresponding mapping relation set comprises:
matching the source institution data with the prepositive institution data in the name mapping relation set;
the querying the source data in the candidate data set to obtain a query result set when the matching fails comprises:
when the source institution data fails to match, querying the source institution data in the enterprise data set and recalling an institution name set corresponding to the source institution data.
4. The method of claim 3, wherein the target enterprise further has an institution alias data set; the method further comprises:
when no institution name set is obtained by querying the enterprise data set, querying the institution alias data set according to the source institution data and recalling the institution name set corresponding to the source institution data.
5. The method of claim 3, wherein the target enterprise belongs to a target industry, the target industry having an industry data set; the method further comprises:
when the source institution data fails to match, querying the source institution data in the industry data set and recalling an institution name set corresponding to the source institution data.
6. The method according to any one of claims 1 to 5, further comprising:
if the matching succeeds, determining the candidate data associated with the prepositive data that was successfully matched with the source data as the target data.
7. The method of any one of claims 1 to 5, wherein the source data comprises at least one of source product data and source unit data; the mapping relation set comprises at least one of a product mapping relation set and a unit mapping relation set; the matching of the source data with the prepositive data in the corresponding mapping relation set comprises at least one of the following:
matching the source product data with the prepositive product data in the product mapping relation set;
and matching the source unit data with the prepositive unit data in the unit mapping relation set.
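Claim 7 matches each kind of source data against its own mapping relation set. A minimal sketch, assuming one dictionary per attribute; the attribute keys `"product"` and `"unit"`, the helper `match_record`, and the sample values are hypothetical.

```python
from __future__ import annotations

from typing import Optional

# One mapping relation set per attribute of the product (assumed layout).
mapping_sets: dict[str, dict[str, str]] = {
    "product": {"Aspirin Tab 100mg": "Aspirin Tablets 100 mg"},
    "unit": {"bx": "box", "btl": "bottle"},
}


def match_record(record: dict[str, str]) -> dict[str, Optional[str]]:
    """Match every attribute value against the mapping set for that attribute."""
    return {
        attr: mapping_sets.get(attr, {}).get(value)  # None marks a failed match
        for attr, value in record.items()
    }


result = match_record({"product": "Aspirin Tab 100mg", "unit": "pcs"})
```

Attributes that come back as `None` are the ones that would go on to the query step of claim 1.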
8. The method of any one of claims 1 to 5, wherein the prepositive data and the candidate data refer to the same pharmaceutical and medical device entity;
the candidate data set comprises the candidate data in the mapping relation set.
9. A method for target data determination, the method comprising:
providing a data processing page for source data; wherein the source data represents attribute data of a pharmaceutical and medical device product; a plurality of mapping relation sets are divided according to the attribute data of the pharmaceutical and medical device product; each mapping relation set comprises prepositive data and candidate data having an association relation;
displaying processing information of the source data in the data processing page; wherein the processing information is generated based on a matching result between the source data and the prepositive data in the corresponding mapping relation set; the processing information has a corresponding matching operation control;
if the matching operation control is triggered, displaying at least one target candidate data included in a query result set; wherein each target candidate data has a corresponding matching confirmation control; the query result set is obtained by querying a candidate data set for the source data if the matching of the source data fails; and the candidate data set comprises a plurality of candidate data;
if the matching confirmation control is triggered, determining, among the target candidate data, target data that matches the source data.
10. The method of claim 9, wherein the processing information comprises a processed result and/or a to-be-processed result; wherein the processed result is generated based on the number of source data successfully matched with the prepositive data in the corresponding mapping relation set; and the to-be-processed result is generated based on the number of source data not matched with the prepositive data in the corresponding mapping relation set.
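The processing information of claim 10 amounts to two counts. A minimal sketch, assuming the mapping relation set is a dictionary and that the page shows plain counts; the function name `summarize` and the result keys are hypothetical.

```python
from __future__ import annotations


def summarize(sources: list[str], mapping: dict[str, str]) -> dict[str, int]:
    """Count source data that matched prepositive data and source data that did not."""
    processed = sum(1 for s in sources if s in mapping)
    return {"processed": processed, "pending": len(sources) - processed}


summary = summarize(["bx", "btl", "pcs"], {"bx": "box", "btl": "bottle"})
```

The `pending` entries are the ones for which the matching operation control of claim 9 would be offered.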
11. An apparatus for determining target data, the apparatus comprising:
a source data receiving module, configured to receive source data; wherein the source data represents attribute data of a pharmaceutical and medical device product; a plurality of mapping relation sets are divided according to the attribute data of the pharmaceutical and medical device product; each mapping relation set comprises prepositive data and candidate data having an association relation;
a source data matching module, configured to match the source data with the prepositive data in the corresponding mapping relation set;
a source data query module, configured to query the candidate data set for the source data if the matching fails, to obtain a query result set; wherein the candidate data set comprises a plurality of candidate data, and the query result set includes at least one target candidate data;
and a target data determining module, configured to determine, among the target candidate data, target data that matches the source data.
12. An apparatus for determining target data, the apparatus comprising:
a processing page providing module, configured to provide a data processing page for source data; wherein the source data represents attribute data of a pharmaceutical and medical device product; a plurality of mapping relation sets are divided according to the attribute data of the pharmaceutical and medical device product; each mapping relation set comprises prepositive data and candidate data having an association relation;
a processing information display module, configured to display processing information of the source data in the data processing page; wherein the processing information is generated based on a matching result between the source data and the prepositive data in the corresponding mapping relation set; the processing information has a corresponding matching operation control;
a query result display module, configured to display at least one target candidate data included in a query result set if the matching operation control is triggered; wherein each target candidate data has a corresponding matching confirmation control; the query result set is obtained by querying a candidate data set for the source data if the matching of the source data fails; and the candidate data set comprises a plurality of candidate data;
and a target data determining module, configured to determine, among the target candidate data, target data that matches the source data if the matching confirmation control is triggered.
13. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 10 when executing the computer program.
14. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 10.
15. A computer program product comprising instructions, wherein the instructions, when executed by a processor of a computer device, enable the computer device to perform the steps of the method of any one of claims 1 to 10.
CN202111308637.3A 2021-11-05 2021-11-05 Target data determination method and device, computer equipment and storage medium Pending CN114020733A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111308637.3A CN114020733A (en) 2021-11-05 2021-11-05 Target data determination method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111308637.3A CN114020733A (en) 2021-11-05 2021-11-05 Target data determination method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114020733A (en) 2022-02-08

Family

ID=80061678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111308637.3A Pending CN114020733A (en) 2021-11-05 2021-11-05 Target data determination method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114020733A (en)

Similar Documents

Publication Publication Date Title
US9529863B1 (en) Normalizing ingested data sets based on fuzzy comparisons to known data sets
CN112364004B (en) Data warehouse-based policy data processing method, device and storage medium
CN114168544A (en) Clinical test data processing method and device, computer equipment and storage medium
CN114582473A (en) Reservation management method, reservation management device, computer equipment and storage medium
CN110428342B (en) Data restoration method, server, customer service side and storage medium
CN117390011A (en) Report data processing method, device, computer equipment and storage medium
CN114020733A (en) Target data determination method and device, computer equipment and storage medium
CN110874365A (en) Information query method and related equipment thereof
CN111553749A (en) Activity push strategy configuration method and device
US10521597B2 (en) Computing device and method for input site qualification
US20220293254A1 (en) Automated data aggregation with file analysis and predictive modeling
CN116304251A (en) Label processing method, device, computer equipment and storage medium
EP4092610A1 (en) Information processing method, device, system, and computer-readable storage medium
CN114461895A (en) Medical information pushing method and device, computer equipment and storage medium
US9727621B2 (en) Systems and methods for servicing database events
CN114676359A (en) Form display method and device, computer equipment and storage medium
CN112231377A (en) Data mapping method, system, device, server and storage medium
CN113157890A (en) Intelligent question and answer method and device, electronic equipment and readable storage medium
CN113517050A (en) Method and device for determining prescription order, electronic equipment and storage medium
US11443835B1 (en) Methods and systems for processing data inquires
CN111008191A (en) Completion method and device for database query and computer equipment
US20180101662A1 (en) System and Method for Processing Clinical Trial Data
KR102432066B1 (en) Method and Server for Providing Web Service with Customer Compatibility using Matching Table related to Standardized Bill of Material
CN114896489B (en) Object recommendation information generation and display method and device, electronic equipment and medium
CN113435554A (en) Method, device, equipment and medium for managing and displaying information triggered by code scanning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination