CN115328948A - Master data quality management method, master data quality management device, computer equipment and storage medium - Google Patents

Publication number
CN115328948A
Authority
CN
China
Prior art keywords
rule
main data
data
field
verification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210161602.XA
Other languages
Chinese (zh)
Inventor
葛贵荣
张健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Meichuang Technology Co ltd
Original Assignee
Hangzhou Meichuang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Meichuang Technology Co ltd filed Critical Hangzhou Meichuang Technology Co ltd
Priority to CN202210161602.XA priority Critical patent/CN115328948A/en
Publication of CN115328948A publication Critical patent/CN115328948A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24564 Applying rules; Deductive queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Abstract

An embodiment of the invention discloses a master data quality management method and apparatus, a computer device, and a storage medium. The method comprises: configuring dictionary entries of the data; configuring rules; configuring master data model information; configuring basic information of the master data model fields; configuring rules for the master data model fields; acquiring newly added or modified master data; performing rule verification on the master data to obtain a verification result; determining whether the verification result is a pass; if so, storing the master data in the database; if not, generating prompt information according to the rules that the master data fails to satisfy, sending the prompt information to the terminal so that the terminal modifies the master data, and performing the rule verification on the master data again to obtain a verification result. Implementing the method provided by the embodiment addresses the problems in the prior art that a large number of rules must be configured manually, which is error-prone, liable to omissions, and time-consuming.

Description

Master data quality management method, master data quality management device, computer equipment and storage medium
Technical Field
The present invention relates to a method for managing master data, and more particularly, to a method and apparatus for managing master data quality, a computer device, and a storage medium.
Background
Master data is the basic information of an organization that meets the needs of cross-department business collaboration and reflects the state attributes of core business entities. Master data management governs master data values so that an enterprise can use consistent, shared master data across systems, obtain coordinated, consistent, high-quality master data from an authoritative data source, and reduce cost and complexity, thereby supporting cross-department and cross-system data integration.
Master data management systems currently on the market mainly cover data modeling, data integration, data management, data services, basic management, standards management and the like. Yet although master data is regarded as "golden data", the functions for master data quality management within the master data management process are rudimentary. A common way to improve master data quality today is to purchase a separate data quality management system and manually configure the master data model rules in it, that is, to embed the rule code by hand. Manually configuring a large number of master data model rules takes a great deal of time, and rules are often configured late, omitted, or configured incorrectly.
Therefore, a new method is needed to solve the problems in the prior art that a large number of rules must be configured manually, which is error-prone, liable to omissions, and time-consuming.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a master data quality management method and apparatus, a computer device, and a storage medium.
To achieve this purpose, the invention adopts the following technical solution: the master data quality management method comprises the following steps:
configuring dictionary entries of the data;
configuring rules;
configuring master data model information;
configuring basic information of the master data model fields;
configuring rules of the master data model fields;
acquiring newly added or modified master data;
performing rule verification on the master data to obtain a verification result;
determining whether the verification result is a pass;
if the verification result is a pass, storing the master data in the database;
if the verification result is a failure, generating prompt information according to the rules that the master data fails to satisfy, sending the prompt information to the terminal so that the terminal modifies the master data, and performing the rule verification on the master data again to obtain a verification result.
In a further technical solution: the dictionary entries include a dictionary name, a dictionary description, and a list of dictionary values and dictionary value descriptions.
In a further technical solution: configuring the rules comprises:
configuring the rules in a built-in rule mode or a new rule mode, wherein the built-in rule mode configures a rule name, a rule description, and an identifier indicating whether the rule is enabled, and selects a rule configuration mode, the rule configuration mode comprising at least one of a dictionary mode and a regular expression mode; and the new rule mode creates rules from at least one of the following conditions: primary key, unique key, foreign key, length, data format, dictionary, required, and value range.
In a further technical solution: the master data model information comprises a model English name, a model description, and a model data type.
In a further technical solution: the rules of the master data model fields comprise whether to enable the automatically matched built-in field rules and whether to enable the newly created field rules.
In a further technical solution: performing rule verification on the master data to obtain a verification result comprises:
acquiring the corresponding rules;
filtering the rules and selecting the enabled rules to obtain target rules;
determining whether the SQL field in the information of a target rule is empty;
if the SQL field in the information of the target rule is empty, performing an in-memory check on the master data to obtain a verification result;
if the SQL field in the information of the target rule is not empty, performing an SQL check on the master data to obtain a verification result.
The present invention also provides a master data quality management apparatus, comprising:
a first configuration unit, configured to configure dictionary entries of the data;
a second configuration unit, configured to configure rules;
a third configuration unit, configured to configure master data model information;
a fourth configuration unit, configured to configure basic information of the master data model fields;
a fifth configuration unit, configured to configure rules of the master data model fields;
an acquisition unit, configured to acquire newly added or modified master data;
a verification unit, configured to perform rule verification on the master data to obtain a verification result;
a judging unit, configured to determine whether the verification result is a pass;
a storage unit, configured to store the master data in the database if the verification result is a pass;
an information generating unit, configured to, if the verification result is a failure, generate prompt information according to the rules that the master data fails to satisfy, send the prompt information to the terminal so that the terminal modifies the master data, and perform the rule verification on the master data again to obtain a verification result.
In a further technical solution: the second configuration unit is configured to configure the rules in a built-in rule mode or a new rule mode, wherein the built-in rule mode configures a rule name, a rule description, and an identifier indicating whether the rule is enabled, and selects a rule configuration mode, the rule configuration mode comprising at least one of a dictionary mode and a regular expression mode; and the new rule mode creates rules from at least one of the following conditions: primary key, unique key, foreign key, length, data format, dictionary, required, and value range.
The invention also provides a computer device comprising a memory and a processor, wherein the memory stores a computer program and the processor implements the above method when executing the computer program.
The invention also provides a storage medium storing a computer program which, when executed by a processor, implements the above method.
Compared with the prior art, the invention has the following beneficial effects: a large number of rules are configured automatically while the model fields are configured, and rule information is configured together with the model. Whenever master data is added or modified, rule verification is required; the master data is stored only if it passes, and data that fails must be modified until it satisfies the rules before it can be stored. This solves the problems in the prior art that a large number of rules must be configured manually, which is error-prone, liable to omissions, and time-consuming.
The invention is further described below with reference to the figures and the specific embodiments.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic view of an application scenario of a master data quality management method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a master data quality management method according to an embodiment of the present invention;
Fig. 3 is a schematic sub-flowchart of a master data quality management method according to an embodiment of the present invention;
Fig. 4 is a schematic block diagram of a master data quality management apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic block diagram of the verification unit of a master data quality management apparatus according to an embodiment of the present invention;
Fig. 6 is a schematic block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Referring to fig. 1 and fig. 2, fig. 1 is a schematic view of an application scenario of a master data quality management method according to an embodiment of the present invention, and fig. 2 is a schematic flowchart of the method. The master data quality management method is applied to a server, which exchanges data with a terminal. A large number of rules are configured automatically while the model fields are configured, which greatly improves rule configuration efficiency and shortens rule configuration time. Because the rule information is configured together with the model, the timeliness, correctness, and completeness of the rules improve, the rule configuration of the master data model is improved, the master data quality management process is simplified, and master data quality is effectively improved. When newly added or modified master data entered at the terminal is acquired, it is verified against the configured rules; master data that passes verification is stored in the database, and otherwise prompt information is sent to the terminal.
As shown in fig. 2, the method includes the following steps S110 to S200.
S110: Configuring dictionary entries of the data.
In this embodiment, a dictionary entry includes a dictionary name, a dictionary description, and a list of dictionary values and dictionary value descriptions.
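As an illustration only, a dictionary entry of this shape could be modelled as below. The class name `DictionaryEntry`, its attribute names, and the `contains` helper are hypothetical and do not come from the patent; the gender values mirror the example given later in the description.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryEntry:
    """Hypothetical model of a dictionary entry: name, description, value list."""
    name: str
    description: str
    # Maps each dictionary value to its dictionary value description.
    values: dict = field(default_factory=dict)

    def contains(self, value) -> bool:
        # Compare as strings so a numeric field value such as 1 matches "1".
        return str(value) in self.values


# The gender dictionary used in the worked example of the description.
gender = DictionaryEntry(
    name="gender",
    description="gender",
    values={"1": "male", "2": "female", "3": "unknown"},
)
```

A field bound to this entry would accept 1, 2, or 3 and reject any other value.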
S120: Configuring rules.
In this embodiment, the configured rules refer to whether each of several constraint conditions is enabled and what values it takes; the conditions include primary key, unique key, foreign key, length, data format, dictionary, required, and value range.
Specifically, the rules are configured in a built-in rule mode or a new rule mode. The built-in rule mode configures a rule name, a rule description, and an identifier indicating whether the rule is enabled, and selects a rule configuration mode, which comprises at least one of a dictionary mode and a regular expression mode. The new rule mode creates rules from at least one of the following conditions: primary key, unique key, foreign key, length, data format, dictionary, required, and value range.
If the dictionary mode is selected, a configured dictionary entry must be selected; if the regular expression mode is selected, the regular expression must be filled in. Rules can also be created automatically from the conditions primary key, unique key, foreign key, length, data format, dictionary, required, and value range: to enable a condition, select it and click the enable button; a condition that is not to be enabled is left unselected.
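The built-in rule mode described above can be sketched roughly as follows. This is a minimal sketch, not the patented implementation: the class `BuiltInRule`, its attribute names, and the "postal code" rule with its pattern are assumptions made for illustration. The sketch enforces the stated constraint that the dictionary mode needs a selected dictionary entry and the regular expression mode needs an expression.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class BuiltInRule:
    """Hypothetical built-in rule: name, description, enabled flag, config mode."""
    name: str
    description: str
    enabled: bool
    mode: str  # "dictionary" or "regex"
    dictionary_values: Optional[frozenset] = None
    pattern: Optional[str] = None

    def __post_init__(self):
        # Each configuration mode requires its own piece of configuration.
        if self.mode == "dictionary":
            if not self.dictionary_values:
                raise ValueError("dictionary mode needs a configured dictionary entry")
        elif self.mode == "regex":
            if not self.pattern:
                raise ValueError("regex mode needs a regular expression")
        else:
            raise ValueError(f"unknown configuration mode: {self.mode}")

    def check(self, value) -> bool:
        if self.mode == "dictionary":
            return str(value) in self.dictionary_values
        return re.fullmatch(self.pattern, str(value)) is not None


gender_rule = BuiltInRule("gender", "gender", True, "dictionary",
                          dictionary_values=frozenset({"1", "2", "3"}))
# Illustrative rule only; the patent does not define a postal code rule.
postcode_rule = BuiltInRule("postal code", "six-digit postal code", True, "regex",
                            pattern=r"\d{6}")
```

Keeping the mode-specific validation in one place means an incomplete rule is rejected at configuration time rather than at verification time.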
S130: Configuring master data model information.
In this embodiment, the master data model information includes a model English name, a model description, and a model data type. The model data type is selectable between master data and mapping data, and the model information also includes a model remark.
S140: Configuring basic information of the master data model fields.
In this embodiment, the basic information of a field includes: the field English name; the field comment; the field type; the field length; the field precision; whether the field is a primary key; whether the field is unique; whether the field is an auto-increment column; the associated master data field; whether the field is required; the field dictionary information; the field data type, selectable from string, numeric value, and date/time; the field value range, comprising a selectable value range function (greater than, less than, equal to, not equal to, greater than or equal to, less than or equal to, or range interval) together with the minimum and maximum of the field value range; and the field data format.
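The selectable value range functions listed above could be evaluated along the following lines. This is a sketch under assumptions: the function name `in_value_range` and the `bound` parameter for the single-threshold comparisons are hypothetical, since the description only names the functions and the minimum and maximum of a range interval.

```python
import operator

# The six comparison functions named in the field's basic information.
_COMPARISONS = {
    "greater than": operator.gt,
    "less than": operator.lt,
    "equal to": operator.eq,
    "not equal to": operator.ne,
    "greater than or equal to": operator.ge,
    "less than or equal to": operator.le,
}


def in_value_range(value, function, bound=None, minimum=None, maximum=None):
    """Return True if value satisfies the configured value range function.

    `bound` (an assumed parameter) is the threshold for the single-sided
    comparisons; `minimum` and `maximum` delimit a closed range interval.
    """
    if function == "range interval":
        return minimum <= value <= maximum
    return _COMPARISONS[function](value, bound)
```

For the AGE field of the later example, `in_value_range(18, "range interval", minimum=1, maximum=120)` evaluates the configured interval.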
Some of the basic field information, such as whether the field is a primary key, whether it is unique, and whether it is auto-increment, can be set with a one-click toggle, while other information is entered as field input, which makes operation convenient.
S150: Configuring rules of the master data model fields.
In this embodiment, the rules of the master data model fields include whether to enable the automatically matched built-in field rules and whether to enable the newly created field rules.
Specifically, each automatically matched built-in field rule can be set by configuring whether it is enabled, and each automatically created field rule can likewise be set by configuring whether it is enabled. Field rules can also be configured manually, including the rule name, the rule description, the identifier indicating whether the rule is enabled, and the rule configuration mode: if the custom mode is selected, only the custom RULESQL needs to be filled in; if the dictionary mode is selected, a configured dictionary entry must be selected; if the regular expression mode is selected, the regular expression must be filled in.
S160: Acquiring newly added or modified master data.
In this embodiment, a master data administrator adds or modifies master data and enters it through the terminal. Rule verification is triggered automatically before the master data is stored. If the master data passes, it enters the master database; if it fails, the data that failed and the list of rules it failed are displayed. During rule verification, the full rule list of the master data model is acquired, the rules are filtered by their enabled flag, and the master data is verified against the rules whose enabled flag is yes.
S170: Performing rule verification on the master data to obtain a verification result.
In this embodiment, the verification result indicates whether the master data satisfies the conditions set by the rules.
In an embodiment, referring to fig. 3, step S170 may include steps S171 to S175.
S171: Acquiring the corresponding rules.
In this embodiment, the acquired rules are everything configured for the master data model, including all the content set in steps S110 to S150.
S172: Filtering the rules and selecting the enabled rules to obtain the target rules.
In this embodiment, a target rule is an enabled rule; each target rule also specifies its verification mode.
S173: Determining whether the SQL field in the information of the target rule is empty.
S174: If the SQL field in the information of the target rule is empty, performing an in-memory check on the master data to obtain a verification result.
In this embodiment, an existing in-memory checking method may be used to check the master data against the rule.
S175: If the SQL field in the information of the target rule is not empty, performing an SQL check on the master data to obtain a verification result.
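Steps S171 to S175 can be sketched as a small dispatcher. Everything here is an assumption for illustration: the `Rule` class, the `verify` function, the use of SQLite as a stand-in for the master database, the `?` placeholder in the RULESQL (the description writes the value inline as 'value'), and the convention that an SQL rule passes when its count query returns 0, as it does for the unique rule in the later example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    """Hypothetical rule record; `sql` plays the role of the RULESQL field."""
    name: str
    column: str
    enabled: bool
    sql: Optional[str] = None  # empty means in-memory check
    check: Optional[Callable[[object], bool]] = None


def verify(record, rules, conn):
    """Return the names of the enabled rules that the record fails."""
    failed = []
    for rule in rules:
        if not rule.enabled:  # S172: keep only enabled rules
            continue
        value = record.get(rule.column)
        if rule.sql:  # S175: SQL field not empty, run the SQL check
            count = conn.execute(rule.sql, (value,)).fetchone()[0]
            passed = count == 0
        else:  # S174: SQL field empty, run the in-memory check
            passed = rule.check(value)
        if not passed:
            failed.append(rule.name)
    return failed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (ID INTEGER, REAL_NAME TEXT)")
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Zhang San')")

rules = [
    Rule("ID required", "ID", True, check=lambda v: v is not None),
    Rule("ID unique", "ID", True, sql="SELECT COUNT(1) FROM CUSTOMER WHERE ID = ?"),
    Rule("REAL_NAME length", "REAL_NAME", False, check=lambda v: len(v) <= 100),
]
```

Binding the value as a query parameter rather than substituting it into the SQL text is a design choice of this sketch; it avoids quoting problems when values contain special characters.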
S180: Determining whether the verification result is a pass.
S190: If the verification result is a pass, storing the master data in the database.
S200: If the verification result is a failure, generating prompt information according to the rules that the master data fails to satisfy, sending the prompt information to the terminal so that the terminal modifies the master data, and returning to step S170.
The master data administrator modifies the master data according to the prompt information, and rule verification is triggered automatically again before the data is stored. The administrator must keep modifying the master data until it passes rule verification; only then can it enter the master database.
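The verify, store, or prompt-and-retry loop of steps S170 to S200 might look like the sketch below. The function `submit` and its callback parameters are hypothetical, and the `max_rounds` cap is an assumption added so the sketch terminates; the description itself simply repeats until verification passes.

```python
def submit(record, validate, store, request_fix, max_rounds=10):
    """Verify a record, store it on success, otherwise ask the terminal to fix it.

    validate(record)            -> list of failed rule names (empty means pass)
    store(record)               -> writes the record to the master database
    request_fix(record, failed) -> corrected record returned by the terminal
    """
    for _ in range(max_rounds):
        failed = validate(record)              # S170: rule verification
        if not failed:                         # S180: did it pass?
            store(record)                      # S190: store the master data
            return True
        record = request_fix(record, failed)   # S200: prompt, then re-verify
    return False


stored = []


def validate(record):
    return [] if 1 <= record["AGE"] <= 120 else ["AGE value range 1..120"]


def request_fix(record, failed):
    # Stand-in for the terminal: the administrator corrects the AGE value.
    return {**record, "AGE": 18}


accepted = submit({"ID": 1, "AGE": 0}, validate, stored.append, request_fix)
```

In this run the first record fails the AGE rule, the simulated terminal corrects it, and the corrected record is stored on the second round.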
For example, suppose master data quality management is required for a customer information table. The customer table CUSTOMER has the following fields: the user ID, which is the primary key; the name REAL_NAME, which is required; the identity card number ID_CARD, which is required and unique; the mobile phone number PHONE, which is unique; the mailbox MAIL, which is unique; the gender SEX, which uses the gender dictionary; the birth date BIRTH_DAY, whose format is YYYYMMDD; and the age AGE, whose value range is a range interval with minimum 1 and maximum 120.
The dictionary entry is configured first: dictionary name: gender; dictionary description: gender; dictionary value list: dictionary value 1, description male; dictionary value 2, description female; dictionary value 3, description unknown.
Rule configuration follows, starting with the built-in rules.
Identity card built-in rule: rule name: identity card; rule description: identity card number of a Chinese citizen; rule enabled identifier: yes; configuration mode: regular expression; regular expression content: ^[1-9]\d{7}((0\d)|(1[0-2]))(([0|1|2]\d)|3[0-1])\d{3}$|^[1-9]\d{5}[1-9]\d{3}((0\d)|(1[0-2]))(([0|1|2]\d)|3[0-1])\d{3}([0-9]|X)$
Mailbox built-in rule: rule name: mailbox; rule description: mailbox address; rule enabled identifier: yes; configuration mode: regular expression; regular expression content: ^[a-zA-Z0-9_-]+@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*\.[a-zA-Z0-9]{2,6}$
Mobile phone number built-in rule: rule name: mobile phone number; rule description: mobile phone number; rule enabled identifier: yes; configuration mode: regular expression; regular expression content: ^1[3,4,5,6,7,8,9][[:digit:]]{9}$
Next, the conditions for automatically created rules are configured: primary key enabled: yes; unique enabled: yes; foreign key enabled: yes; length enabled: yes; data format enabled: yes; dictionary enabled: yes; required enabled: yes; value range enabled: yes.
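To show how regular-expression rules of this kind are applied, the sketch below checks values against an identity card pattern and a mailbox pattern. The two patterns are a best-effort reading of the expressions in the example above, whose extracted text is damaged, so treat them as assumptions rather than the patent's exact expressions; the helper `matches` and the sample values are hypothetical, and the identity card pattern checks only the layout of the number, not its check digit.

```python
import re

# 15-digit (old) or 18-digit identity card number layout; assumed reconstruction.
ID_CARD_PATTERN = (
    r"[1-9]\d{7}((0\d)|(1[0-2]))(([0|1|2]\d)|3[0-1])\d{3}"
    r"|[1-9]\d{5}[1-9]\d{3}((0\d)|(1[0-2]))(([0|1|2]\d)|3[0-1])\d{3}([0-9]|X)"
)

# Mailbox address layout; assumed reconstruction.
MAIL_PATTERN = r"[a-zA-Z0-9_-]+@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*\.[a-zA-Z0-9]{2,6}"


def matches(pattern, value):
    """Return True if the whole value matches the rule's regular expression."""
    return re.fullmatch(pattern, value) is not None
```

`re.fullmatch` is used so that the anchors `^` and `$` are not needed in the patterns; a value with trailing characters still fails the check.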
Then the master data model information is configured: model English name: CUSTOMER; model description: customer information table; model data type: master data; model remark: stores all customer information of the company.
The basic information of the master data model fields is configured as follows.
Field English name: ID; field comment: user ID; field type: NUMBER; field length: 20; field precision: 0; primary key: yes; unique: yes; auto-increment: yes; associated master data: NULL; associated master data field: NULL; required: yes; field dictionary information: NULL; field data type: numeric value; field value range: value range function NULL, minimum NULL, maximum NULL; field data format: NULL.
Field English name: REAL_NAME; field comment: name; field type: VARCHAR2; field length: 100; field precision: NULL; primary key: no; unique: no; auto-increment: no; associated master data: NULL; associated master data field: NULL; required: yes; field dictionary information: NULL; field data type: string; field value range: value range function NULL, minimum NULL, maximum NULL; field data format: NULL.
Field English name: ID_CARD; field comment: identity card number; field type: VARCHAR2; field length: 18; field precision: NULL; primary key: no; unique: yes; auto-increment: no; associated master data: NULL; associated master data field: NULL; required: yes; field dictionary information: NULL; field data type: string; field value range: value range function NULL, minimum NULL, maximum NULL; field data format: NULL.
Field English name: PHONE; field comment: mobile phone number; field type: VARCHAR2; field length: 11; field precision: NULL; primary key: no; unique: yes; auto-increment: no; associated master data: NULL; associated master data field: NULL; required: yes; field dictionary information: NULL; field data type: string; field value range: value range function NULL, minimum NULL, maximum NULL; field data format: NULL.
Field English name: MAIL; field comment: mailbox; field type: VARCHAR2; field length: 100; field precision: NULL; primary key: no; unique: yes; auto-increment: no; associated master data: NULL; associated master data field: NULL; required: no; field dictionary information: NULL; field data type: string; field value range: value range function NULL, minimum NULL, maximum NULL; field data format: NULL.
Field English name: SEX; field comment: gender; field type: NUMBER; field length: 1; field precision: NULL; primary key: no; unique: no; auto-increment: no; associated master data: NULL; associated master data field: NULL; required: yes; field dictionary information: gender; field data type: numeric value; field value range: value range function NULL, minimum NULL, maximum NULL; field data format: NULL.
Field English name: BIRTH_DAY; field comment: birth date; field type: VARCHAR2; field length: 50; field precision: NULL; primary key: no; unique: no; auto-increment: no; associated master data: NULL; associated master data field: NULL; required: no; field dictionary information: NULL; field data type: date; field value range: value range function NULL, minimum NULL, maximum NULL; field data format: YYYYMMDD.
Field English name: AGE; field comment: age; field type: NUMBER; field length: 10; field precision: 0; primary key: no; unique: no; auto-increment: no; associated master data: NULL; associated master data field: NULL; required: yes; field dictionary information: NULL; field data type: numeric value; field value range: value range function range interval, minimum 1, maximum 120; field data format: NULL.
After the basic information of the master data model fields is configured, the built-in rules can be matched automatically and the new field rules created, which configures the rules of the master data model fields. Built-in rule matching: if a field comment contains a rule name or a rule description, the field is matched to that built-in rule. New rule creation: if a condition is enabled in the automatic-creation configuration and the field is also configured with that condition, the corresponding rule is created for the field. The primary key and unique conditions overlap; if a field is configured with both, only the unique rule needs to be created.
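The automatic matching and creation described above can be sketched as two small functions. The function names, the dictionary-based field and rule representations, and the case-insensitive comparison are assumptions made for this illustration; only a subset of the conditions (required, unique, primary key, length) is shown.

```python
def match_built_in_rules(field_comment, built_in_rules):
    """Return names of built-in rules whose name or description occurs in the comment."""
    comment = field_comment.lower()  # case-insensitive matching is an assumption
    return [
        rule["name"]
        for rule in built_in_rules
        if rule["name"].lower() in comment or rule["description"].lower() in comment
    ]


def create_field_rules(field, enabled_conditions):
    """Create a rule for each condition that is both enabled and set on the field."""
    rules = []
    if "required" in enabled_conditions and field.get("required"):
        rules.append("required")
    # Primary key and unique overlap: when both are set, only the unique rule is kept.
    if "unique" in enabled_conditions and field.get("unique"):
        rules.append("unique")
    elif "primary key" in enabled_conditions and field.get("primary_key"):
        rules.append("primary key")
    if "length" in enabled_conditions and field.get("length") is not None:
        rules.append(f"length <= {field['length']}")
    return rules


built_in = [
    {"name": "mobile phone number", "description": "mobile phone number"},
    {"name": "mailbox", "description": "mailbox address"},
]
conditions = {"primary key", "unique", "required", "length"}
```

Because the matching is plain substring containment, a field comment must literally include the rule name or description for the built-in rule to attach.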
With the configuration performed according to the above steps, the following rules are generated, each marked as enabled.
ID field: required rule and unique rule. Required rule: before insertion, check whether the data is empty. Unique rule, RULESQL: select count(1) from CUSTOMER where ID = 'value'.
REAL_NAME field: required rule and length less than or equal to 100 rule. Required rule: before insertion, check whether the data is empty. Length less than or equal to 100 rule: before insertion, check that the length is less than or equal to 100.
ID_CARD field: required rule, length less than or equal to 18 rule, identity card rule and unique rule. Required rule: before insertion, check whether the data is empty. Length less than or equal to 18 rule: before insertion, check that the length is less than or equal to 18. Identity card rule: check the data with the regular expression of the identity card rule. Unique rule, RULESQL: select count(1) from CUSTOMER where ID_CARD = 'value'.
PHONE field: required rule, unique rule and mobile phone number rule. Required rule: before insertion, check whether the data is empty. Unique rule, RULESQL: select count(1) from CUSTOMER where PHONE = 'value'. Mobile phone number rule: check the data with the regular expression of the mobile phone number rule.
MAIL field: unique rule, mailbox rule and length less than 100 rule. Unique rule, RULESQL: select count(1) from CUSTOMER where MAIL = 'value'. Mailbox rule: check the data with the regular expression of the mailbox rule. Length less than or equal to 18 rule: before insertion, check that the length is less than or equal to 18.
SEX field: required rule, length less than or equal to 1 rule and gender dictionary rule. Required rule: before insertion, check whether the data is empty. Length less than or equal to 1 rule: before insertion, check that the length is less than or equal to 1. Gender dictionary rule: before insertion, check whether the data falls within the value range defined for the field.
BIRTH_DAY field: length less than or equal to 50 rule and data format YYYYMMDD rule. Length less than or equal to 50 rule: before insertion, check that the length is less than or equal to 50. Data format YYYYMMDD rule: before insertion, check whether the data matches the YYYYMMDD format.
AGE field: required rule, length less than or equal to 10 rule, and value range greater than or equal to 1 and less than or equal to 120 rule. Required rule: before insertion, check whether the data is empty. Length less than or equal to 10 rule: before insertion, check that the length is less than or equal to 10. Value range rule: before insertion, check that the data is greater than or equal to 1 and less than or equal to 120.
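The in-memory rules listed above (required, length, data format and value range) can be sketched as small Python predicates. This is only an illustrative sketch: the function names are hypothetical and do not come from the embodiment, and the date check assumes that YYYYMMDD means a valid calendar date written as eight digits.

```python
from datetime import datetime


def check_required(value):
    """Required rule: the value must not be empty."""
    return value is not None and str(value).strip() != ""


def check_max_length(value, limit):
    """Length rule: the string form of the value must not exceed `limit`."""
    return len(str(value)) <= limit


def check_date_format(value):
    """Data format rule: the value must be a valid date written as YYYYMMDD."""
    text = str(value)
    if len(text) != 8 or not text.isdigit():
        return False
    try:
        datetime.strptime(text, "%Y%m%d")
        return True
    except ValueError:
        return False


def check_range(value, low, high):
    """Value range rule: low <= value <= high, with non-numeric input rejected."""
    try:
        return low <= int(value) <= high
    except (TypeError, ValueError):
        return False
```

Each predicate returns a boolean, so a rule engine can treat all in-memory rules uniformly regardless of what they test.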
Configuring the rules of the master data model field includes configuring whether the automatically matched built-in field rules are enabled and configuring whether the automatically created field rules are enabled. Here, the length less than or equal to 10 rule of the AGE field is configured as not enabled. Field rules can also be configured manually; in this business scenario, no manual configuration of field rules is needed.
Suppose a master data manager adds a new piece of master data: ID is 1, REAL_NAME is Zhang San, ID_CARD is 330xxxxxxxxxxxxxxx, PHONE is 13xxxxxxxxx, MAIL is a@xxx, SEX is 1, BIRTH_DAY is 1xx9 12xx, and AGE is 18. Rule verification starts automatically: the program obtains all rules under the CUSTOMER model, filters them by whether they are enabled, and verifies the remaining rules one by one. The ID field has a required rule and a unique rule. The program first checks whether the ID data is empty; the ID is 1, which is not empty, so the data passes the ID required rule. It then checks that the ID is unique by executing select count(1) from CUSTOMER where ID = '1' against the master database; the query returns 0, meaning the database contains no data with an ID of 1, so the data passes the ID unique rule. The other rules under the model are checked in the same way to judge whether the data is valid. All rules pass, and the data enters the master database.
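The unique check in this example can be sketched with Python's built-in sqlite3 module and an in-memory database. This is an assumption made purely for illustration: the embodiment does not name a database engine, the helper name is_unique is hypothetical, and the table here holds only two of the CUSTOMER columns.

```python
import sqlite3


def is_unique(conn, table, column, value):
    """Run the unique rule's count query; True when no stored row has the value."""
    # Table and column names are assumed to come from trusted rule configuration;
    # the checked value itself is bound as a query parameter.
    sql = f"select count(1) from {table} where {column} = ?"
    (count,) = conn.execute(sql, (value,)).fetchone()
    return count == 0


conn = sqlite3.connect(":memory:")
conn.execute("create table CUSTOMER (ID text, ID_CARD text)")

# The table is empty, so the first record's ID passes the unique rule.
first_id_ok = is_unique(conn, "CUSTOMER", "ID", "1")

# After the record is stored, the same ID would no longer be unique.
conn.execute("insert into CUSTOMER values (?, ?)", ("1", "330xxxxxxxxxxxxxxx"))
repeat_id_ok = is_unique(conn, "CUSTOMER", "ID", "1")
```

Binding the value as a parameter, rather than splicing it into the 'value' placeholder of the RULESQL text, avoids quoting problems when the data itself contains quote characters.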
Suppose the master data manager then adds another piece of master data: ID is 2, REAL_NAME is Li Si, ID_CARD is 330xxxxxxxxxxxxxxx, PHONE is 133xxxxxxxxxx, MAIL is abc@xxx.com, SEX is 1, BIRTH_DAY is 1xx9 2xx, and AGE is 20. Rule verification starts automatically: the program obtains all rules under the CUSTOMER model, filters them by whether they are enabled, and verifies the remaining rules one by one. The ID field has a required rule and a unique rule. The program first checks whether the ID data is empty; the ID is 2, which is not empty, so the data complies with the ID required rule. It then checks that the ID is unique by executing select count(1) from CUSTOMER where ID = '2' against the master database; the query returns 0, meaning the database contains no data with an ID of 2, so the data complies with the ID unique rule. The identity card rules are checked next. Required rule: the ID_CARD data is 330xxxxxxxxxxxxxxx, which is not empty, so the data complies. Length less than or equal to 18 rule: the data complies. Identity card rule: the data is checked with the regular expression of the identity card rule and complies. Unique rule, RULESQL: select count(1) from CUSTOMER where ID_CARD = '330xxxxxxxxxxxxxxx' returns 1, meaning the data already exists in the database, so the data does not comply. The failed rule information (the unique rule of the identity card field) and the identity card data are packaged and put into an error information list. The other enabled rules continue to be checked; even if all of them pass, the error information list is returned to the master data manager.
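The behavior described here, where every enabled rule is checked and failures are collected rather than stopping at the first one, can be sketched as follows. The rule dictionaries, the validate function and the existing_cards set (which stands in for the master database) are hypothetical names introduced only for this illustration.

```python
def validate(record, rules):
    """Run every enabled rule and collect all failures into an error list."""
    errors = []
    for rule in rules:
        if not rule["enabled"]:
            continue  # disabled rules are filtered out before checking
        value = record.get(rule["field"])
        if not rule["check"](value):
            # Package the failed rule together with the offending data.
            errors.append({"field": rule["field"], "rule": rule["name"], "value": value})
    return errors


# Stand-in for the master database: identity card numbers already stored.
existing_cards = {"330xxxxxxxxxxxxxxx"}

rules = [
    {"field": "ID", "name": "required", "enabled": True,
     "check": lambda v: v not in (None, "")},
    {"field": "ID_CARD", "name": "unique", "enabled": True,
     "check": lambda v: v not in existing_cards},
    {"field": "AGE", "name": "length<=10", "enabled": False,
     "check": lambda v: len(str(v)) <= 10},
]

record = {"ID": "2", "ID_CARD": "330xxxxxxxxxxxxxxx", "AGE": 20}
errors = validate(record, rules)
```

Collecting all failures in one pass lets the manager fix every problem in a single round trip instead of resubmitting once per failed rule.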
The master data manager modifies the master data according to the failed rule and the offending data given in the prompt information. After ID_CARD is changed to 3301xxxxxxxxxxxx, warehousing is triggered again; this time the identity card passes the unique check, the other rules also pass, and the master data is warehoused successfully.
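The submit, prompt, modify and resubmit cycle of this worked example can be sketched as a small function. All names (submit, validate, database, prompts) are hypothetical, a plain list stands in for the master database, and only the ID_CARD unique rule is modeled.

```python
def submit(record, validate, store, notify):
    """Validate a new or modified record; store it on success, otherwise send prompts."""
    errors = validate(record)
    if errors:
        notify(errors)  # the manager is expected to fix the data and resubmit
        return False
    store(record)
    return True


# Stand-ins for the master database and for the prompts sent to the terminal.
database = [{"ID": "1", "ID_CARD": "330xxxxxxxxxxxxxxx"}]
prompts = []


def validate(record):
    """Only the ID_CARD unique rule is modeled in this sketch."""
    taken = {row["ID_CARD"] for row in database}
    return ["ID_CARD must be unique"] if record["ID_CARD"] in taken else []


record = {"ID": "2", "ID_CARD": "330xxxxxxxxxxxxxxx"}
first_try = submit(record, validate, database.append, prompts.append)

# The manager corrects the identity card number and triggers warehousing again.
record["ID_CARD"] = "3301xxxxxxxxxxxx"
second_try = submit(record, validate, database.append, prompts.append)
```

The first submission is rejected and produces a prompt; the corrected record passes and is appended to the stand-in database.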
The method of this embodiment automatically configures a large number of rules while the model fields are being configured, which greatly improves rule configuration efficiency and shortens rule configuration time. Because rule information is configured at model configuration time, the timeliness, correctness and integrity of the rules are improved as well. This simplifies the master data quality management flow and effectively improves the quality of the master data.
According to the master data quality management method, a large number of rules are configured automatically when the model fields are configured, and rule information is configured during model configuration. When master data is added or modified, it must be verified against the rules: it is warehoused only if the verification passes; if the verification fails, the master data must be modified until it satisfies the rules before it can be warehoused. This solves the prior-art problems that a large number of rules must be configured manually, which is error-prone, prone to omissions, and time-consuming.
Fig. 4 is a schematic block diagram of a master data quality management apparatus 300 according to an embodiment of the present invention. As shown in fig. 4, the present invention also provides a master data quality management apparatus 300 corresponding to the above master data quality management method. The master data quality management apparatus 300 includes a unit for performing the above-described master data quality management method, and the apparatus may be configured in a server. Specifically, referring to fig. 4, the master data quality management apparatus 300 includes a first configuration unit 301, a second configuration unit 302, a third configuration unit 303, a fourth configuration unit 304, a fifth configuration unit 305, an acquisition unit 306, a verification unit 307, a judgment unit 308, a warehousing unit 309, and an information generation unit 310.
A first configuration unit 301, configured to configure dictionary entries of data; a second configuration unit 302, configured to configure rules; a third configuration unit 303, configured to configure the master data model information; a fourth configuration unit 304, configured to configure basic information of the primary data model field; a fifth configuration unit 305 for configuring rules of the primary data model fields; an obtaining unit 306, configured to obtain new or modified master data; a checking unit 307, configured to perform rule checking on the main data to obtain a checking result; a judging unit 308, configured to judge whether the verification result is a verification pass; a warehousing unit 309, configured to perform warehousing operation on the master data if the verification result is that the verification passes; an information generating unit 310, configured to generate a prompt message according to a rule that the main data does not conform to if the verification result is that the verification fails, send the prompt message to a terminal, so that the terminal modifies the main data, and perform the rule verification on the main data to obtain a verification result.
In an embodiment, the second configuration unit 302 is configured to configure a rule in a built-in rule manner or a newly created rule manner. In the built-in rule manner, a rule name and a rule description are configured, an identifier indicating whether the rule is enabled is set, and a rule configuration manner is selected, where the rule configuration manner includes at least one of a dictionary manner and a regular expression manner. In the newly created rule manner, at least one of a primary key, unique key, foreign key, length, data format, dictionary, required and value range rule is created.
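The two configuration manners for built-in rules, the regular expression manner and the dictionary manner, might look like the following sketch. The patterns and the gender dictionary values are assumptions chosen for illustration; the source does not give the actual expressions or dictionary contents.

```python
import re

# Illustrative patterns only; the real built-in expressions are not given in the source.
ID_CARD_PATTERN = re.compile(r"\d{17}[\dXx]")  # 17 digits plus a digit or X
PHONE_PATTERN = re.compile(r"1\d{10}")         # 11 digits starting with 1

# Hypothetical gender dictionary: dictionary value -> value description.
GENDER_DICT = {"1": "male", "2": "female"}


def check_regex(value, pattern):
    """Regular expression manner: the whole value must match the pattern."""
    return pattern.fullmatch(str(value)) is not None


def check_dictionary(value, dictionary):
    """Dictionary manner: the value must be one of the configured dictionary values."""
    return str(value) in dictionary
```

For example, check_regex("13912345678", PHONE_PATTERN) accepts an 11-digit number, while check_dictionary("3", GENDER_DICT) rejects a value outside the configured dictionary.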
In an embodiment, as shown in fig. 5, the verification unit 307 includes a rule obtaining subunit, a filtering subunit 3071, a field judging subunit 3072, a memory verification subunit 3073, and an SQL verification subunit 3074.
A rule obtaining subunit, configured to obtain the corresponding rules; a filtering subunit, configured to filter the rules and screen out the enabled rules to obtain target rules; a field judging subunit 3072, configured to judge whether the SQL field in the information of a target rule is empty; a memory verification subunit 3073, configured to perform memory verification on the master data to obtain a verification result if the SQL field in the information of the target rule is empty; and an SQL verification subunit 3074, configured to perform SQL verification on the master data to obtain a verification result if the SQL field in the information of the target rule is not empty.
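The division of work among these subunits can be sketched as a filter followed by a dispatch on the rule's SQL field. The function names and the stub checkers below are hypothetical; the stubs only record which path was taken and do not touch a real database.

```python
def select_target_rules(rules):
    """Filtering step: keep only the enabled rules."""
    return [rule for rule in rules if rule.get("enabled")]


def run_rule(rule, value, memory_check, sql_check):
    """Dispatch one target rule: empty SQL field -> memory check, otherwise SQL check."""
    if not rule.get("sql"):
        return memory_check(rule, value)
    return sql_check(rule["sql"], value)


calls = []


def memory_check(rule, value):
    calls.append("memory")
    return value not in (None, "")


def sql_check(sql, value):
    calls.append("sql")
    return True  # stand-in: pretend the count query returned 0


rules = [
    {"name": "required", "enabled": True, "sql": ""},
    {"name": "unique", "enabled": True, "sql": "select count(1) from CUSTOMER where ID = ?"},
    {"name": "length<=10", "enabled": False, "sql": ""},
]

results = [run_rule(r, "1", memory_check, sql_check) for r in select_target_rules(rules)]
```

Keeping the dispatch separate from the checkers means only rules that need stored data, such as unique rules, pay for a database round trip.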
It should be noted that, as can be clearly understood by those skilled in the art, the specific implementation processes of the main data quality management apparatus 300 and each unit may refer to the corresponding descriptions in the foregoing method embodiments, and for convenience and brevity of description, no further description is provided herein.
The master data quality management apparatus 300 described above may be implemented in the form of a computer program that can be run on a computer device as shown in fig. 6.
Referring to fig. 6, fig. 6 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 500 may be a server, wherein the server may be an independent server or a server cluster composed of a plurality of servers.
Referring to fig. 6, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, cause the processor 502 to perform the master data quality management method.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall computer device 500.
The internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503; when the computer program 5032 is executed by the processor 502, the processor 502 performs the master data quality management method.
The network interface 505 is used for network communication with other devices. Those skilled in the art will appreciate that the configuration shown in fig. 6 is a block diagram of only a portion of the configuration associated with the present application and does not constitute a limitation of the computer device 500 to which the present application may be applied, and that a particular computer device 500 may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
Wherein the processor 502 is configured to run the computer program 5032 stored in the memory to implement the following steps:
configuring dictionary items of data; configuring a rule; configuring master data model information; configuring basic information of a main data model field; configuring rules for the primary data model fields; acquiring newly added or modified main data; carrying out rule verification on the main data to obtain a verification result; judging whether the checking result is passed or not; if the verification result is that the verification is passed, performing warehousing operation on the main data; if the verification result is that the verification is not passed, generating prompt information according to the rule that the main data is not in accordance, sending the prompt information to the terminal so that the terminal modifies the main data, and executing the rule verification on the main data to obtain a verification result.
Wherein the dictionary entry includes a dictionary name, a dictionary description, a dictionary value, and a dictionary value description list.
The main data model information comprises a model English name, a model description and a model data type.
The rules of the master data model field include a switch for whether to enable the automatically matched built-in field rules and a switch for whether to enable the newly created field rules.
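The three kinds of configuration information described above can be pictured as simple records. The class and attribute names are hypothetical, and the example values (the gender dictionary contents and the "table" data type) are assumptions rather than values taken from the source.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryEntry:
    """Dictionary item: name, description, and values with their descriptions."""
    name: str
    description: str
    values: dict = field(default_factory=dict)  # dictionary value -> value description


@dataclass
class ModelInfo:
    """Master data model information."""
    english_name: str
    description: str
    data_type: str


@dataclass
class FieldRuleSwitches:
    """Per-field switches for built-in rules and newly created rules."""
    builtin_rules_enabled: bool = True
    new_rules_enabled: bool = True


gender = DictionaryEntry("SEX", "gender dictionary", {"1": "male", "2": "female"})
customer = ModelInfo("CUSTOMER", "customer master data", "table")
age_switches = FieldRuleSwitches(builtin_rules_enabled=True, new_rules_enabled=False)
```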
In an embodiment, when the processor 502 implements the step of configuring the rule, the following steps are specifically implemented:
configuring a rule in a built-in rule manner or a newly created rule manner, wherein in the built-in rule manner, a rule name and a rule description are configured, an identifier indicating whether the rule is enabled is set, and a rule configuration manner is selected, the rule configuration manner including at least one of a dictionary manner and a regular expression manner; and in the newly created rule manner, at least one of a primary key, unique key, foreign key, length, data format, dictionary, required and value range rule is created.
In an embodiment, when the processor 502 implements the step of performing the rule check on the main data to obtain the check result, the following steps are specifically implemented:
acquiring a corresponding rule; filtering the rule, and screening out an opened rule to obtain a target rule; judging whether the SQL field in the information of the target rule is empty or not; if the SQL field in the information of the target rule is empty, performing memory verification on the main data to obtain a verification result; and if the SQL field in the information of the target rule is not empty, performing SQL verification on the main data to obtain a verification result.
It should be understood that, in the embodiment of the present application, the processor 502 may be a Central Processing Unit (CPU), and the processor 502 may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
It will be understood by those skilled in the art that all or part of the flow of the method implementing the above embodiments may be implemented by a computer program instructing associated hardware. The computer program includes program instructions, and the computer program may be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the flow steps of the embodiments of the method described above.
Accordingly, the present invention also provides a storage medium. The storage medium may be a computer-readable storage medium. The storage medium stores a computer program, wherein the computer program, when executed by a processor, causes the processor to perform the steps of:
configuring dictionary items of data; configuring a rule; configuring master data model information; configuring basic information of a main data model field; configuring rules of the main data model field; acquiring newly added or modified main data; carrying out rule verification on the main data to obtain a verification result; judging whether the checking result is passed or not; if the verification result is that the verification is passed, performing warehousing operation on the main data; if the verification result is that the verification is not passed, generating prompt information according to the rule that the main data is not in accordance, sending the prompt information to the terminal so that the terminal modifies the main data, and executing the rule verification on the main data to obtain a verification result.
Wherein the dictionary entry includes a dictionary name, a dictionary description, a dictionary value, and a dictionary value description list.
The main data model information comprises a model English name, a model description and a model data type.
The rules of the master data model field include a switch for whether to enable the automatically matched built-in field rules and a switch for whether to enable the newly created field rules.
In an embodiment, when the processor executes the computer program to implement the step of configuring the rule, the following steps are specifically implemented:
configuring a rule in a built-in rule manner or a newly created rule manner, wherein in the built-in rule manner, a rule name and a rule description are configured, an identifier indicating whether the rule is enabled is set, and a rule configuration manner is selected, the rule configuration manner including at least one of a dictionary manner and a regular expression manner; and in the newly created rule manner, at least one of a primary key, unique key, foreign key, length, data format, dictionary, required and value range rule is created.
In an embodiment, when the processor executes the computer program to implement the step of performing rule verification on the main data to obtain a verification result, the following steps are specifically implemented:
acquiring a corresponding rule;
filtering the rules, and screening out the opened rules to obtain target rules; judging whether the SQL field in the information of the target rule is empty or not; if the SQL field in the information of the target rule is empty, performing memory verification on the main data to obtain a verification result; and if the SQL field in the information of the target rule is not empty, performing SQL verification on the main data to obtain a verification result.
The storage medium may be a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, an optical disk, or any other computer-readable storage medium capable of storing program code.
Those of ordinary skill in the art will appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the components and steps of the various examples have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative. For example, the division of each unit is only one logic function division, and there may be another division manner in actual implementation. For example, various elements or components may be combined or may be integrated into another system, or some features may be omitted, or not implemented.
The steps in the method of the embodiment of the invention can be sequentially adjusted, combined and deleted according to actual needs. The units in the device of the embodiment of the invention can be merged, divided and deleted according to actual needs. In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a terminal, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. The master data quality management method is characterized by comprising the following steps:
configuring dictionary entries of the data;
configuring a rule;
configuring master data model information;
configuring basic information of a main data model field;
configuring rules of the main data model field;
acquiring newly added or modified main data;
carrying out rule verification on the main data to obtain a verification result;
judging whether the checking result is passed or not;
if the verification result is that the verification is passed, performing warehousing operation on the main data;
if the verification result is that the verification is not passed, generating prompt information according to the rule that the main data is not in accordance, sending the prompt information to the terminal so that the terminal modifies the main data, and executing the rule verification on the main data to obtain a verification result.
2. A method for master data quality management according to claim 1, wherein the dictionary entries comprise dictionary names, dictionary descriptions, dictionary values, and a list of dictionary value descriptions.
3. The primary data quality management method according to claim 1, wherein the configuration rule comprises:
configuring a rule in a built-in rule manner or a newly created rule manner, wherein in the built-in rule manner, a rule name and a rule description are configured, an identifier indicating whether the rule is enabled is set, and a rule configuration manner is selected, the rule configuration manner including at least one of a dictionary manner and a regular expression manner; and in the newly created rule manner, at least one of a primary key, unique key, foreign key, length, data format, dictionary, required and value range rule is created.
4. The primary data quality management method according to claim 1, wherein the primary data model information includes a model english name, a model description, and a model data type.
5. The primary data quality management method according to claim 1, wherein the rules of the primary data model field include whether to start a function of automatically matching built-in rules of the field, and whether to start a function of newly creating field rules.
6. The method for quality management of main data according to claim 1, wherein the performing a rule check on the main data to obtain a check result comprises:
acquiring a corresponding rule;
filtering the rule, and screening out an opened rule to obtain a target rule;
judging whether the SQL field in the information of the target rule is empty or not;
if the SQL field in the information of the target rule is empty, performing memory verification on the main data to obtain a verification result;
and if the SQL field in the information of the target rule is not empty, performing SQL verification on the main data to obtain a verification result.
7. A master data quality management apparatus, comprising:
the first configuration unit is used for configuring dictionary items of the data;
a second configuration unit for configuring the rule;
a third configuration unit for configuring the master data model information;
the fourth configuration unit is used for configuring the basic information of the main data model field;
the fifth configuration unit is used for configuring the rules of the main data model field;
an acquisition unit configured to acquire newly added or modified master data;
the verification unit is used for carrying out rule verification on the main data to obtain a verification result;
the judging unit is used for judging whether the checking result is passed through;
the storage unit is used for carrying out storage operation on the main data if the verification result is that the verification is passed;
and the information generating unit is used for generating prompt information according to the rule that the main data does not conform to if the verification result is that the verification is not passed, sending the prompt information to the terminal so that the terminal modifies the main data, and executing the rule verification on the main data to obtain a verification result.
8. The apparatus according to claim 7, wherein the second configuration unit is configured to configure the rule in a built-in rule manner or a newly created rule manner, wherein in the built-in rule manner, a rule name and a rule description are configured, an identifier indicating whether the rule is enabled is set, and a rule configuration manner is selected, the rule configuration manner including at least one of a dictionary manner and a regular expression manner; and in the newly created rule manner, at least one of a primary key, unique key, foreign key, length, data format, dictionary, required and value range rule is created.
9. A computer device, characterized in that the computer device comprises a memory, on which a computer program is stored, and a processor, which when executing the computer program implements the method according to any of claims 1 to 7.
10. A storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 7.
CN202210161602.XA 2022-02-22 2022-02-22 Master data quality management method, master data quality management device, computer equipment and storage medium Pending CN115328948A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210161602.XA CN115328948A (en) 2022-02-22 2022-02-22 Master data quality management method, master data quality management device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210161602.XA CN115328948A (en) 2022-02-22 2022-02-22 Master data quality management method, master data quality management device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115328948A true CN115328948A (en) 2022-11-11

Family

ID=83916055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210161602.XA Pending CN115328948A (en) 2022-02-22 2022-02-22 Master data quality management method, master data quality management device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115328948A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109933578A (en) * 2019-03-21 2019-06-25 浪潮软件集团有限公司 A kind of configurable automated data detection method for quality and system
CN110059078A (en) * 2019-04-19 2019-07-26 中国航空无线电电子研究所 A kind of dynamic configuration and method of calibration of navigational route database customization data
CN111723086A (en) * 2020-07-20 2020-09-29 江苏苏宁银行股份有限公司 Data quality checking method, device and equipment and readable storage medium
CN112395325A (en) * 2020-11-27 2021-02-23 广州光点信息科技有限公司 Data management method, system, terminal equipment and storage medium
CN113127458A (en) * 2019-12-30 2021-07-16 北京奇虎科技有限公司 Data quality auditing method and device, electronic equipment and storage medium
CN113127455A (en) * 2019-12-30 2021-07-16 北京奇虎科技有限公司 Data management method and device, electronic equipment and readable storage medium
US11106643B1 (en) * 2017-08-02 2021-08-31 Synchrony Bank System and method for integrating systems to implement data quality processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 103-27, Building 19, No. 1399, Liangmu Road, Cangqian Street, Yuhang District, Hangzhou City, Zhejiang Province, 311121

Applicant after: Hangzhou Meichuang Technology Co.,Ltd.

Address before: 310011 room 1201, building 7, Tianxing International Center, No. 508, Fengtan Road, Gongshu District, Hangzhou, Zhejiang Province

Applicant before: HANGZHOU MEICHUANG TECHNOLOGY CO.,LTD.