WO2021184995A1 - Data processing method and data standard management system - Google Patents

Data processing method and data standard management system

Info

Publication number
WO2021184995A1
WO2021184995A1 PCT/CN2021/075477 CN2021075477W WO2021184995A1 WO 2021184995 A1 WO2021184995 A1 WO 2021184995A1 CN 2021075477 W CN2021075477 W CN 2021075477W WO 2021184995 A1 WO2021184995 A1 WO 2021184995A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
information
data element
standard
database table
Prior art date
Application number
PCT/CN2021/075477
Other languages
English (en)
French (fr)
Inventor
柴永明
宋国英
崔静
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021184995A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation

Definitions

  • This application relates to the field of data processing, in particular to a data processing method and a data standard management system.
  • Data standards are data meanings and business rules that a specified group (such as a certain industry or a certain organization) needs to abide by.
  • Data standards are usually expressed in the form of a table structure (also known as a database table structure), and the table structure includes multiple business fields.
  • Each business field in the table structure is called a data element, which is the basic unit of a data standard.
  • the information of the data element is in compliance with the business rules, and the information includes the name, definition, structure, and value rules.
  • the data standard management system stores information about data standards that one or more industries need to comply with.
  • The data standards can include national standards, industry standards, and/or local standards.
  • The data standard management system can audit the information of the database table structure maintained in a business system. If the database table structure does not meet the requirements of the target data standard that is stored in the data standard management system and corresponds to the business system, the data standard management system establishes a mapping relationship between the database table structure and the target data standard (that is, a mapping relationship between the data elements of the database table structure and the data elements of the target data standard). Each time the business system provides a data service, the data standard management system converts the data that conforms to the database table structure into data that conforms to the target data standard based on the mapping relationship, and then outputs the converted data.
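  • As an illustration of the conversion step described above, the following minimal Python sketch renames the fields of one record according to a mapping relationship. The field names, the `FIELD_MAPPING` table, and the `convert_record` helper are hypothetical and do not come from the application.

```python
from typing import Any, Dict

# Hypothetical mapping: business-system field -> data element of the target data standard.
FIELD_MAPPING: Dict[str, str] = {
    "cust_nm": "customer_name",
    "id_no": "identity_number",
}


def convert_record(record: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Rename each field of a record to the mapped standard data element.

    Fields without a mapping entry are kept under their original name.
    """
    return {mapping.get(field, field): value for field, value in record.items()}


row = {"cust_nm": "Alice", "id_no": "X123", "age": 30}
converted = convert_record(row, FIELD_MAPPING)
```

  • Because this conversion has to run every time data is served, a table structure that already conforms to the standard avoids the step entirely.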
  • the embodiment of the present application provides a data processing method and a data standard management system.
  • the technical solution is as follows:
  • a data processing method including:
  • the business data processing method provided by the embodiment of the present application verifies the information of the database table structure of the business system before the business system goes online, so as to ensure that the business system can adopt an accurate target database table structure after the business system goes online.
  • the target database table structure has higher reliability, which improves the quality of data provided after the business system goes online, reduces the probability of data conversion after the business system goes online, and reduces the cost of subsequent data governance.
  • the data standard management system may obtain information of the first database table structure of the first business system to be launched in various ways.
  • the embodiment of the present application uses the following two methods as examples for illustration:
  • In the first method, the information of the first database table structure is acquired online.
  • the information of the first database table structure and the information of the standard data element both include data element identification and data element description information.
  • the obtaining of the information of the first database table structure of the first business system to be launched includes:
  • Receiving the data element identifier of a first data element, where the data element identifier of the first data element is one of the data element identifiers of the multiple standard data elements stored in the data standard library; and obtaining, from the data standard library, the data element description information corresponding to the data element identifier of the first data element.
  • the data standard management system supports a keyword search function.
  • the data element identifier of the first data element is obtained by searching among the data element identifiers of multiple standard data elements stored in the data standard library through the keyword corresponding to the first data element.
  • the data standard management system also supports a search prompt function to ensure that business personnel effectively determine the data element identifier of the first data element.
  • Before the receiving of the data element identifier of the first data element, the method further includes: matching a received keyword against the data element identifiers of the multiple standard data elements stored in the data standard library; and outputting a matching result, where the matching result includes the information of at least one second standard data element, and the data element identifier of each of the at least one second standard data element matches the keyword.
  • The algorithm for matching the received keyword against the data element identifiers of the multiple standard data elements may be a fuzzy matching algorithm, where fuzzy matching refers to matching with a certain degree of accuracy against given conditions or requirements.
  • the principle of fuzzy matching is to search for the exact same content as the searched content first, and then search for the content that is close to the searched content.
  • The fuzzy matching algorithm also allows part of the characters of the search keyword to be reordered or separated by other characters. The retrieved content can include synonyms and near-synonyms of the keyword, related words, phrases containing the keyword, and so on.
  • The matching result obtained by the fuzzy matching algorithm can include both exact matches and results other than exact matches.
  • The content matched by fuzzy matching is therefore broader, and information on more second standard data elements is obtained, which makes the matching result more useful as a reference for business personnel.
  • With the exact matching algorithm, a match is determined only when the search keyword and the data element identifier of a standard data element are exactly the same, so the matching restriction is precise and strict.
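  • The "exact matches first, then close matches" behavior described above can be sketched as follows. This is only an illustration: it uses Python's standard `difflib` similarity as a stand-in for whatever fuzzy matching algorithm the system actually uses, and the `search_identifiers` helper, the `cutoff` value, and the sample identifiers are assumptions.

```python
import difflib
from typing import List


def search_identifiers(keyword: str, identifiers: List[str], cutoff: float = 0.5) -> List[str]:
    """Return exact matches first, then approximate matches ordered by similarity."""
    exact = [i for i in identifiers if i == keyword]
    rest = [i for i in identifiers if i != keyword]
    # get_close_matches requires n > 0, so guard against an empty candidate list.
    close = difflib.get_close_matches(keyword, rest, n=max(len(rest), 1), cutoff=cutoff)
    return exact + close


results = search_identifiers("name", ["customer_name", "name", "amount", "names"])
```

  • An exact matching algorithm corresponds to returning only the `exact` list, which is why it yields fewer candidates for business personnel to choose from.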
  • the information of the multiple second standard data elements may be arranged in a first specified order.
  • the first specified order can be implemented by the following two exemplary implementation manners:
  • In the first exemplary implementation, the information of the multiple second standard data elements is sorted in descending order of the matching degree between the data element identifier of each second standard data element and the keyword.
  • the matching degree P1 satisfies the first matching degree calculation formula:
  • M is the number of characters that the data element identifier of the second standard data element has in common with the keyword, that is, the number of characters in the intersection of the data element identifier of the second standard data element and the keyword;
  • N is the larger of the number of characters of the data element identifier of the second standard data element and the number of characters of the keyword.
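  • The formula itself is not reproduced in this text. Assuming it is the ratio P1 = M / N implied by the definitions of M and N, a minimal sketch could look like the following. Counting shared characters as a multiset intersection is an interpretation, and the `matching_degree` function name is hypothetical.

```python
from collections import Counter


def matching_degree(identifier: str, keyword: str) -> float:
    """Assumed form P1 = M / N (the ratio is an assumption, not quoted from the text).

    M: number of characters shared by the identifier and the keyword,
       counted here as a multiset intersection (an interpretation).
    N: the larger of the two character counts.
    """
    if not identifier or not keyword:
        return 0.0
    shared = Counter(identifier) & Counter(keyword)
    m = sum(shared.values())
    n = max(len(identifier), len(keyword))
    return m / n


p_same = matching_degree("name", "name")
p_partial = matching_degree("customer_name", "name")
```

  • Under this assumption the value lies between 0 and 1, and an identifier identical to the keyword gets the highest possible matching degree.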
  • In the second exemplary implementation, the information of the multiple second standard data elements is sorted in descending order of the priority of the data standard to which each second standard data element belongs.
  • the priority of the data standard may include standard priority or time priority.
  • the standard priority refers to the priority of the standard itself.
  • The order of standard priority from high to low is: national standards, industry standards, local standards.
  • Under time priority, the closer the release time or implementation time of a standard is to the current time, the higher its priority usually is.
  • the first specified order may also have other manners.
  • the first specified order is an order determined by combining the foregoing first and second exemplary implementation manners. That is, the information of the multiple second standard data elements is sorted according to the degree of matching between the data element identifier of the second standard data element and the keyword and the priority of the data standard to which the second standard data element belongs. For example, for each second standard data element, the data standard management system can obtain the degree of matching between the data element identifier of the second standard data element and the keyword, and obtain the priority of the data standard to which the second standard data element belongs.
  • the data standard management system sorts the information of each second standard data element according to the sort indication value corresponding to each second standard data element. It is usually sorted in descending order according to the sorting indicator value.
  • In the second method, the information of the first database table structure is obtained by receiving a data standard document that was edited offline.
  • the obtaining information about the structure of the first database table of the first business system to be launched includes:
  • a data standard document is received, where the data standard document includes information on the table structure of the first database.
  • the third-party modeling tool can access (for example, query) the data standard library, obtain the information of the standard data elements stored in the data standard library, and generate data standard documents based on the information of the standard data elements.
  • the process of receiving the data standard document includes: receiving the data standard document generated by a third-party modeling tool based on the data standard library.
  • the obtaining information about the structure of the first database table of the first business system to be launched includes:
  • Outputting a data dictionary template, which is a reference template for the information of the first database table structure.
  • The data standard management system outputs the data dictionary template for reference by business personnel, so that business personnel no longer rely solely on their own experience to formulate data element information but have a basis for doing so. This improves the accuracy of the acquired information of the first database table structure and reduces the complexity and computational cost of the subsequent verification process.
  • The verification of the information of the first database table structure may include at least the following two optional methods: data standard symbolic verification and data standard normative verification.
  • the first optional method is data standard symbolic verification.
  • the verification process refers to verifying the information of at least one data element based on the information of the standard data element in the data standard library.
  • The information of the standard data elements is the aforementioned "symbol".
  • the process of verifying the information of the first database table structure based on the data standard library includes:
  • When the information of a first data element does not match the information of any of the multiple standard data elements, first modification prompt information is sent. The first modification prompt information indicates that the information of the first data element is to be updated, and the first data element is one of the at least one data element. After updated information of the first data element that matches the information of any one of the multiple standard data elements is received, it is determined that the information of the first data element is verified successfully.
  • Through the first modification prompt information, which the data standard management system may send multiple times, business personnel can revise the information of the first data element repeatedly until it meets the requirements of the standard data elements in the data standard library. In this way, the data element information defined by the business personnel is consistent with the information of the standard data elements in the data standard library.
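  • The prompt-and-revise loop described above can be sketched as follows. This is a minimal illustration that treats the data standard library as a dictionary from data element identifier to description information; the library contents, the `verify_table_structure` helper, and the prompt wording are all hypothetical.

```python
from typing import Dict, List

# Hypothetical data standard library: data element identifier -> description information.
STANDARD_LIBRARY: Dict[str, str] = {
    "customer_name": "Name of the customer",
    "identity_number": "Identity document number",
}


def verify_table_structure(elements: Dict[str, str],
                           library: Dict[str, str]) -> List[str]:
    """Return one modification prompt per data element that matches no standard data element."""
    prompts = []
    for identifier, description in elements.items():
        if library.get(identifier) != description:
            prompts.append(f"update data element '{identifier}'")
    return prompts


draft = {"customer_name": "Name of the customer", "id_no": "Identity document number"}
first_round = verify_table_structure(draft, STANDARD_LIBRARY)

# The business person revises the flagged element and resubmits.
revised = {"customer_name": "Name of the customer",
           "identity_number": "Identity document number"}
second_round = verify_table_structure(revised, STANDARD_LIBRARY)
```

  • Verification succeeds once a round produces no prompts, which here happens on the second submission.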
  • The information of the first database table structure and the information of the standard data elements both include data element identifiers; the first modification prompt information includes the information of at least one first standard data element, and the data element identifier of each of the at least one first standard data element fuzzily matches the data element identifier of the first data element.
  • the fuzzy matching algorithm is a search algorithm in ElasticSearch.
  • The matching result obtained by the fuzzy matching algorithm can include both exact matches and results other than exact matches.
  • The content matched by fuzzy matching is therefore broader, and information on more first standard data elements is obtained, which makes the matching result more useful as a reference for business personnel.
  • the information of the multiple first standard data elements may be arranged in a second specified order.
  • the second specified order can be implemented in the following two exemplary implementation manners:
  • In the first exemplary implementation, the information of the multiple first standard data elements is sorted in descending order of the matching degree between the data element identifier of each first standard data element and the data element identifier of the first data element.
  • In the second exemplary implementation, the information of the multiple first standard data elements is sorted in descending order of the priority of the data standard to which each first standard data element belongs.
  • the second specified order may also have other manners.
  • the second specified order is an order determined by combining the foregoing first and second exemplary implementation manners. That is, the information of the multiple first standard data elements is sorted according to the degree of matching between the data element identifier of the first standard data element and the data element identifier of the first data element and the priority of the data standard to which the first standard data element belongs.
  • For each first standard data element, the data standard management system can obtain the degree of matching between the data element identifier of the first standard data element and the data element identifier of the first data element, obtain the priority of the data standard to which the first standard data element belongs, and assign a value to that priority according to a specified rule, where the priority is positively correlated with the assigned value. Then, using weights assigned in advance to the matching degree and the priority respectively, the sort indication value of the first standard data element is determined by weighted summation.
  • the data standard management system sorts the information of each first standard data element according to the sort indication value corresponding to each first standard data element. It is usually sorted in descending order according to the sorting indicator value.
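  • The combined ordering can be sketched as a weighted sum, as below. The text does not give the weights, the values assigned to each priority, or any normalization, so the `PRIORITY_VALUE` table, the 0.7/0.3 weights, the division by the maximum priority value, and the sample candidates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical assigned values: a higher value means a higher-priority data standard.
PRIORITY_VALUE = {"national": 3, "industry": 2, "local": 1}


@dataclass
class Candidate:
    identifier: str
    standard: str           # "national", "industry" or "local"
    matching_degree: float  # assumed to lie in [0, 1]


def sort_indication_value(c: Candidate, w_match: float = 0.7, w_priority: float = 0.3) -> float:
    """Weighted sum of the matching degree and the normalized priority value."""
    priority = PRIORITY_VALUE[c.standard] / max(PRIORITY_VALUE.values())
    return w_match * c.matching_degree + w_priority * priority


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Sort candidates in descending order of their sort indication value."""
    return sorted(candidates, key=sort_indication_value, reverse=True)


candidates = [
    Candidate("cust_name", "local", 0.9),
    Candidate("customer_name", "national", 0.8),
    Candidate("name", "industry", 0.4),
]
ranked = [c.identifier for c in rank(candidates)]
```

  • With these assumed weights, a national-standard candidate with a slightly lower matching degree is listed ahead of a local-standard candidate with a higher one.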
  • The method further includes: receiving updated information of the first database table structure; and determining the information of the first data element from the information of all data elements in the information of the database table structure.
  • The second optional method is data standard normative verification.
  • This verification process checks whether the information of the data standard is normative, mainly by verifying the format of that information.
  • Verifying the information of the first database table structure based on the data standard library includes:
  • Second modification prompt information is sent, where the second modification prompt information indicates that the format of the information of the first database table structure is to be updated; after updated information of the first database table structure whose format meets the format requirements is received, it is determined that the format of the information of the first database table structure is verified successfully.
  • Through the second modification prompt information, which the data standard management system may send multiple times, business personnel can revise the format of the information of the first database table structure repeatedly until it meets the data standard management system's requirements for the standard information format. In this way, business personnel can define database table structure information that meets the requirements.
  • the method further includes:
  • When the value corresponding to a second data element is an enumerable value, data element remark information is added for the second data element, where the data element remark information identifies the enumerable values corresponding to the second data element, and the second data element is one of the at least one data element.
  • The target database table structure determined based on the first database table structure may still include the data element remark information added for the second data element.
  • After the first business system goes online, if data corresponding to the second data element needs to be collected, the data can be collected directly in the enumerable value format corresponding to the second data element. This ensures that the collected data meets the format requirements of the data standard library, that is, complies with the relevant standards.
  • the method further includes:
  • The data standard library operation request includes a standard data element addition request, a standard data element update request, a standard data element deletion request, or a standard data element query request; after the data standard library operation request is successfully authenticated, the operation corresponding to the request is executed on the data standard library.
  • The first authentication method: authentication for data operations with a high confidentiality level.
  • The data standard management system detects whether the account carried in the data standard request is a first-level account, where the first level is higher than a specified level threshold.
  • the first level account is the account of the system administrator.
  • When the data standard management system detects that the account carried in the data standard request is not a first-level account, it determines that the authentication of the data standard library operation request fails.
  • When the data standard management system detects that the account carried in the data standard request is a first-level account, in one optional manner, the data standard management system determines that the authentication of the data standard library operation request is successful; in another optional manner, the data standard management system sends the data standard library operation request to the terminal device corresponding to a second-level account, and determines that the authentication of the data standard library operation request is successful after receiving a permission instruction indicating that operating the data standard library is allowed, or determines that the authentication has failed after receiving a prohibition instruction indicating that operating the data standard library is not allowed.
  • the second level is higher than or equal to the first level, and the account of the second level is different from the account of the first level.
  • The second-level account is the account of the project administrator, and the terminal device corresponding to the second-level account is the aforementioned first terminal device.
  • The project administrator determines, according to the content of the request and the account carried in the request, whether the corresponding person is allowed to operate the data standard library. If so, the permission instruction is sent through the first terminal device; if not, the prohibition instruction is sent through the first terminal device.
  • The second authentication method: authentication for data operations with a low confidentiality level.
  • The data standard management system detects whether the account carried in the data standard request is an account assigned by the data standard management system, that is, a legal account in the data standard management system. For example, an account assigned by the data standard management system is any one of the accounts of project administrators, business personnel, and system administrators.
  • When the data standard management system detects that the account carried in the data standard request is not an account assigned by the data standard management system, it determines that the authentication of the data standard library operation request fails.
  • When the data standard management system detects that the account carried in the data standard request is an account assigned by the data standard management system, in one optional manner, the data standard management system determines that the authentication of the data standard library operation request is successful; in another optional manner, the data standard management system sends the data standard library operation request to the terminal device corresponding to a third-level account, and determines that the authentication of the data standard library operation request is successful after receiving a permission instruction indicating that operating the data standard library is allowed, or determines that the authentication has failed after receiving a prohibition instruction indicating that operating the data standard library is not allowed.
  • the third level is higher than or equal to the level of the account carried in the aforementioned data standard request, and the third level of account is different from the account carried in the data standard request.
  • the third-level account is the account of the project administrator or system administrator.
  • The terminal device corresponding to the third-level account is the aforementioned third terminal device.
  • The system administrator determines, according to the content of the request and the account carried in the request, whether the corresponding person is allowed to operate the data standard library. If so, the permission instruction is sent through the third terminal device; if not, the prohibition instruction is sent through the third terminal device.
  • the standard management system can also send a data operation response, indicating completion of the operation corresponding to the data standard library operation request performed on the data standard library, or indicating the success of the operation.
  • the standard management system may also send a data operation response, indicating that the operation corresponding to the data standard library operation request is prohibited from being performed on the data standard library, or indicating that the operation fails.
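  • The two authentication methods can be summarized in a short sketch. The account names, the numeric levels, the threshold value, and the `authenticate` helper are hypothetical; the optional `approver` callback stands in for the permission or prohibition instruction returned from another account's terminal device.

```python
from typing import Callable, Optional

# Hypothetical account levels; a larger number means a higher level.
ACCOUNT_LEVELS = {"sysadmin": 3, "project_admin": 2, "business_user": 1}
LEVEL_THRESHOLD = 2  # assumed threshold: only levels above it count as "first level"


def authenticate(account: str, high_confidentiality: bool,
                 approver: Optional[Callable[[str], bool]] = None) -> bool:
    """Return True when the data standard library operation request is authenticated."""
    level = ACCOUNT_LEVELS.get(account)
    if level is None:
        return False  # not an account assigned by the system
    if high_confidentiality and level <= LEVEL_THRESHOLD:
        return False  # first method: a first-level account is required
    if approver is not None:
        return approver(account)  # optional step: another account allows or prohibits
    return True


admin_ok = authenticate("sysadmin", high_confidentiality=True)
user_denied = authenticate("business_user", high_confidentiality=True)
user_ok = authenticate("business_user", high_confidentiality=False)
vetoed = authenticate("business_user", high_confidentiality=False, approver=lambda a: False)
```

  • Passing no `approver` corresponds to the optional manner in which the system decides on its own; passing one corresponds to forwarding the request for a permission or prohibition instruction.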
  • the method further includes: querying an operation log corresponding to the data standard library; when the operation log includes an abnormal operation log, issuing an abnormal alarm.
  • System administrators can determine whether data operations need to be backtracked based on abnormal alarms, and realize data rollback, thereby effectively maintaining the data standard management system.
  • the data standard management system also supports data governance functions.
  • the data standard management system is pre-established with an artificial intelligence model, and data governance is performed through the artificial intelligence model.
  • the data governance process can include the following steps:
  • The information of a second database table structure includes the information of at least one data element; when it is detected, based on the artificial intelligence model, that the information of a third data element does not match the information of the multiple standard data elements, a target standard data element corresponding to the third data element is determined among the multiple standard data elements, where the third data element is one of the at least one data element in the information of the second database table structure; and, based on the artificial intelligence model, a mapping relationship between the third data element and the target standard data element is established.
  • the data standard management system supports script output. After obtaining the target database table structure of the first business system, the method further includes: outputting a script corresponding to the target database table structure.
  • the script is used to generate information about the target database table structure, which includes the database table building statement of the first business system.
  • the business person can receive the script through the second terminal device. After the first business system is online, the business personnel can load and run the script in the first business system.
  • When run, the script generates the information of the target database table structure and constructs the corresponding database table structure from that information. In this way, business personnel do not need to write scripts themselves, which reduces their workload and saves labor costs.
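  • As a rough illustration of script output, the sketch below builds a table-creation statement from a verified table structure. The application does not specify the script format, so the `build_create_table` helper, the SQL dialect, and the column types are assumptions.

```python
from typing import List, Tuple


def build_create_table(table: str, columns: List[Tuple[str, str]]) -> str:
    """Build a CREATE TABLE statement from (data element identifier, SQL type) pairs."""
    body = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {table} (\n{body}\n);"


script = build_create_table(
    "customer",
    [("customer_name", "VARCHAR(64)"), ("gender_code", "CHAR(1)")],
)
```

  • Generating the statement from the verified structure keeps the deployed tables identical to what was checked against the data standard library.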
  • A data standard management system may include at least one module, and the at least one module may be used to implement the data processing method provided by the first aspect or various possible implementations of the first aspect.
  • the present application provides a computing device, which includes a processor and a memory.
  • The memory stores computer instructions; the processor executes the computer instructions stored in the memory, so that the computing device executes the method provided by the foregoing first aspect or various possible implementations of the first aspect, so that the computing device deploys the data standard management system provided by the foregoing second aspect or various possible implementations of the second aspect.
  • The present application provides a computer-readable storage medium in which computer instructions are stored. The computer instructions instruct the computing device to execute the method provided by the foregoing first aspect or various possible implementations of the first aspect, or the computer instructions instruct the computing device to deploy the data standard management system provided by the foregoing second aspect or various possible implementations of the second aspect.
  • the present application provides a computer program product.
  • the computer program product includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • The processor of the computing device can read the computer instructions from the computer-readable storage medium and execute them, so that the computing device executes the method provided by the foregoing first aspect or various possible implementations of the first aspect, so that the computing device deploys the data standard management system provided by the foregoing second aspect or various possible implementations of the second aspect.
  • a chip is provided.
  • the chip may include a programmable logic circuit and/or program instructions. When the chip is running, it is used to implement the data processing method as described in the first aspect.
  • FIG. 1 is a schematic diagram of an application environment of a data standard management system involved in a data processing method provided by an embodiment of the present application;
  • FIG. 2 is a schematic diagram of an application environment of a data standard management system involved in another data processing method provided by an embodiment of the present application;
  • FIG. 3 is a schematic flowchart of a data processing method provided by an exemplary embodiment of the present application.
  • FIG. 4 is a schematic diagram of an interface provided by the data standard management system in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of another interface provided by the data standard management system in an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a data standard management system provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of another data standard management system provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • Data standards are data meanings and business rules that a designated group (such as a certain industry or a certain organization) needs to abide by.
  • Data standards are usually expressed in the form of a table structure, which includes multiple business fields. Since the table structure corresponding to a data standard is usually stored in a database, it is also called a database table structure, and the table defined by the database table structure is a database table (also called a physical table or a data entity).
  • Data element refers to a business field in the database table structure, and is the basic unit of the database table structure.
  • The information of a data element includes information used to describe the attributes of the data element, that is, the attribute information of the data element.
  • The attribute information can include parameters that describe the data element's own attributes, such as its name, definition, structure, and value rules, and can also include parameters that describe the environment to which the data element belongs, such as the name of the database table structure it belongs to.
  • A data element in a data standard is called a standard data element.
  • Code set refers to the definition of enumerable values in the database table structure.
  • Enumerable value refers to the existence of multiple instances of the value.
  • That the value corresponding to a data element is an enumerable value means that the data element can take multiple values, but the number of values is limited.
  • The code set is the definition of those enumerable values when the value corresponding to the data element is an enumerable value.
  • For example, the name of a data element is gender.
  • The content of the data element has two enumerable values, male and female.
  • The code set defines the two enumerable values to be represented by 0 and 1, respectively.
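  • The gender example can be sketched as a small code set in Python. This is only an illustration: the class name `GenderCode` and the helper functions `encode` and `decode` are hypothetical and do not come from the application; the mapping of male to 0 and female to 1 follows the order given in the example.

```python
from enum import Enum


class GenderCode(Enum):
    """Hypothetical code set for the data element named "gender"."""
    MALE = 0
    FEMALE = 1


def encode(value: str) -> int:
    """Map an enumerable value (e.g. "male") to its code in the code set."""
    return GenderCode[value.upper()].value


def decode(code: int) -> str:
    """Map a stored code back to the enumerable value it represents."""
    return GenderCode(code).name.lower()
```

  • With such a code set, a database table only needs to store the codes, and any value outside the code set is rejected (here with a `KeyError` or `ValueError`).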
  • the implementation of a certain project involves one or more business systems.
  • The one or more business systems need to follow the same data standards, such as national standards, industry standards, or local standards. In this way, the operating efficiency and data quality of the project can be improved, and operating costs can be reduced.
  • the data standard management system involved in the data processing method can realize the consistency of the data standard adopted by the business systems maintained by the multiple manufacturers (that is, the database table structure adopted by each manufacturer meets the requirements of the data standard).
  • The administrator of the enterprise is referred to as the project administrator, the management personnel of a manufacturer are referred to as business personnel, and the staff who maintain the data standard management system are referred to as system administrators.
  • the aforementioned organization X may be a communication organization, a power organization, a water conservancy organization, or an agricultural organization, etc.
  • the business system may be a communication business system, a power business system, a water conservancy business system, or an agricultural business system, etc.
  • FIG. 1 is a schematic diagram of an application environment of a data standard management system involved in a data processing method provided by an embodiment of the present application. Please refer to Figure 1.
  • The data standard management system 10 includes a data standard library 101 and a data standard management device 102, which establish a wired or wireless communication connection with each other.
  • the data standard library 101 may be a server or a server cluster composed of multiple servers.
  • the data standard library 101 is used to store information of multiple standard data elements.
  • Standard data elements refer to data elements in established data standards (such as established national standards, industry standards, or local standards).
  • the data standard management device 102 may be a server or a server cluster composed of multiple servers or other computer devices.
  • the data standard management device 102 is used to manage part or all of the functions of one or more business systems, and verify the information of the database table structure of the managed business system through standard data elements stored in the data standard library 101.
  • The data standard management system 10 may further include a life cycle management device 103 and a data management device 104, each of which establishes a wired or wireless communication connection with the data standard library 101.
  • the life cycle management device 103 may be one server or a server cluster composed of multiple servers.
  • the life cycle management device 103 is used for managing operations on the data standard library 101, such as adding operations, updating operations, deleting operations, or query operations of standard data elements.
  • the data management device 104 may be a server or a server cluster composed of multiple servers or a cloud computing center.
  • the data management device 104 is used for data management of the data standard library, such as establishing a mapping relationship between data elements in a database table structure and standard data elements in the data standard library 101.
  • At least two of the aforementioned data standard library 101, data standard management device 102, life cycle management device 103, and data management device 104 can be implemented by corresponding modules of one device, that is, the functions of at least two of these devices are integrated on one device.
  • For example, the device may include a data standard storage module, a data standard management module, a life cycle management module, and a data management module, which respectively implement the functions of the aforementioned data standard library 101, data standard management device 102, life cycle management device 103, and data management device 104.
  • the data standard management system also supports task management functions for managing various business systems and assigning management authority to the business personnel of each business system.
  • Each business person can only manage the databases and other resources related to the business system assigned to that person.
  • The data standard management system 10 may further include a first terminal device, a second terminal device, and a third terminal device corresponding to the project administrator, the business personnel, and the system administrator, respectively. Project administrators, business personnel, and system administrators can access the data standard management system through their corresponding terminal devices.
  • FIG. 3 is a schematic flowchart of a data processing method provided by an exemplary embodiment of the present application.
  • The data processing method may be applied to the data standard management system shown in FIG. 1 or FIG. 2. Subsequent embodiments assume that the first database table structure of the first business system to be launched needs to be verified based on the data standard library, which includes information of multiple standard data elements. As shown in FIG. 3, the method includes:
  • Step 201: The data standard management system obtains information of the first database table structure of the first business system to be launched.
  • the business personnel can establish a database table structure for the first business system. There may be one or more database table structures of the first business system.
  • the database table structure is customized by the business personnel.
  • The business personnel can provide the database table structure they maintain to the data standard management system, so that the data standard management system can verify the information of the database table structure. For example, a business person can input the information of the database table structure into the data standard management system through one or more edit requests sent from the second terminal device he or she uses.
  • For the processing of other database table structures, refer to the processing of the first database table structure.
  • the information of the first database table structure includes information of at least one data element.
  • the information of each data element in the information of the first database table structure includes the attribute information of the data element; in some scenarios, when the value corresponding to the data element is an enumerable value, the information of the data element also includes a code set.
  • The data element information can also be divided into a data element identifier and data element description information.
  • The data element identifier is used to identify the corresponding data element and may include the data element name and/or the data element code (also called the data element number). The data element name may include a Chinese name and/or an English name, and the data element code may be a string of numbers and/or letters used to identify the corresponding data element.
  • The data element description information is used to describe the data element and includes the part of the attribute information other than the data element identifier.
  • In some scenarios, for example when the value corresponding to the data element is an enumerable value, the data element description information further includes a code set.
  • The information of the first database table structure further includes information of the first database and first table information.
  • The database table structure established for a business system needs to be stored in a designated space for easy maintenance. Therefore, a business person usually needs to establish a database before establishing the database table structure, so as to store the information of the data elements included in the first database table structure.
  • The aforementioned first database is the database used to store the first database table structure.
  • the information of the first database may include attribute information such as the database identifier, address, and/or structure.
  • The database identifier is used to identify the corresponding database and may include the database name and/or the database code.
  • The database name may include a Chinese name and/or an English name, and the database code may be a string of numbers and/or letters used to identify the corresponding database.
  • The first table information is the information other than data element information in the first database table, and may include attribute information such as the identifier and/or structure of the database table.
  • The database table identifier is used to identify the corresponding database table and may include the database table name and/or the database table code. The database table name may include a Chinese name and/or an English name, and the database table code may be a string of numbers and/or letters used to identify the corresponding database table.
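  • The composition of the information of a database table structure described above can be sketched as a few Python dataclasses. This is a minimal sketch under stated assumptions: the class names, field names, and sample values (`DB001`, `customer_db`, and so on) are hypothetical and chosen only to mirror the text; the application does not prescribe any particular representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Identifier:
    """Identifier of a database, a table, or a data element: a code and/or names."""
    code: Optional[str] = None
    chinese_name: Optional[str] = None
    english_name: Optional[str] = None


@dataclass
class DataElementInfo:
    """Information of one data element: identifier plus description information."""
    identifier: Identifier
    description: Dict[str, str] = field(default_factory=dict)
    # Present only when the value of the data element is an enumerable value.
    code_set: Optional[Dict[str, str]] = None


@dataclass
class TableStructureInfo:
    """Information of a database table structure (hypothetical layout)."""
    database: Identifier
    table: Identifier
    data_elements: List[DataElementInfo] = field(default_factory=list)


# Hypothetical sample: one table with a single enumerable data element.
example = TableStructureInfo(
    database=Identifier(code="DB001", english_name="customer_db"),
    table=Identifier(code="T001", english_name="customer"),
    data_elements=[
        DataElementInfo(
            identifier=Identifier(code="DE0001", english_name="gender"),
            description={"definition": "gender of the person"},
            code_set={"0": "male", "1": "female"},
        )
    ],
)
```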
  • the data standard management system may obtain information of the first database table structure of the first business system to be launched in various ways.
  • the embodiment of the present application uses the following two methods as examples for illustration:
  • In the first method, the information of the first database table structure is acquired online.
  • the information of the first database table structure can be carried in a data standard document that can be edited online.
  • Since the information of the first database table structure mainly includes the information of at least one data element, the process of obtaining the information of the first database table structure of the first business system to be launched mainly includes the process of obtaining the information of the at least one data element.
  • The business personnel can input the information of multiple data elements into the data standard management system one by one by editing online on the second terminal device they use.
  • Accordingly, the data standard management system receives the information of the multiple data elements one by one online.
  • Other information in the information of the first database table structure can also be input into the data standard management system through online editing, and the data standard management system receives this other information accordingly.
  • In some cases, the data element identifier of the first data element is one of the data element identifiers of the multiple standard data elements stored in the data standard library, and both the information of the first database table structure and the information of the standard data elements stored in the data standard library include a data element identifier and data element description information.
  • In this case, the data standard management system can obtain the information of the first data element through the data element identifier of the first data element. For the process of obtaining the information of other data elements, refer to the process of obtaining the information of the first data element.
  • The business personnel can input the data element identifier of the first data element into the data standard management system through the second terminal device they use.
  • Accordingly, the data standard management system receives the data element identifier of the first data element. Since the first data element is one of the multiple standard data elements stored in the data standard library, the data standard library stores the information of the first data element.
  • The data standard management system can query the data standard library based on the data element identifier of the first data element to obtain the corresponding data element description information, and then obtain the information of the first data element from that data element identifier and data element description information.
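  • The identifier-based lookup can be sketched as follows. This is an illustration only: the in-memory dictionary `STANDARD_LIBRARY`, its sample entry, and the function name `get_data_element_info` are assumptions standing in for the data standard library and its query interface, which the application does not specify.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the data standard library:
# data element identifier -> data element description information.
STANDARD_LIBRARY: Dict[str, Dict[str, str]] = {
    "gender": {"definition": "gender of the person", "value_rule": "enumerable"},
}


def get_data_element_info(
    identifier: str, library: Dict[str, Dict[str, str]]
) -> Optional[Dict[str, object]]:
    """Build the information of a data element from its identifier.

    Returns None when the identifier is not one of the standard data
    element identifiers stored in the library.
    """
    description = library.get(identifier)
    if description is None:
        return None
    # Information of the data element = identifier + description information.
    return {"identifier": identifier, "description": dict(description)}
```

  • The business person only supplies the identifier; the description information is filled in from the library, so it cannot drift from the standard.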
  • the business personnel can input the data element identifier of the first data element by themselves based on their own experience.
  • the data standard management system may provide an input box on the user interface, and the business person enters the data element identifier of the first data element in the input box.
  • the data standard management system supports a keyword search function.
  • the data element identification of the first data element can be obtained by searching among the data element identifications of multiple standard data elements stored in the data standard library through the keyword corresponding to the first data element.
  • The data standard management system can provide a search box on the user interface, and business personnel can enter the keyword corresponding to the first data element in the search box. Based on the keyword, the data standard management system searches for the data element identifier of the first data element among the data element identifiers of the multiple standard data elements stored in the data standard library.
  • The keyword is used to index the data element identifier of the first data element and may consist of one or more characters.
  • The keyword may include English characters, Chinese characters, and/or numeric characters (also called numerical characters).
  • The data standard management system also supports a search prompt function to help business personnel effectively determine the data element identifier of the first data element. For example, before receiving the data element identifier of the first data element, the data standard management system may match the received keyword with the data element identifiers of the multiple standard data elements stored in the data standard library and output the matching result.
  • The matching result includes the information of at least one second standard data element, and the data element identifier of each of the at least one second standard data element matches the keyword.
  • the business personnel can obtain the information of each second standard data element related to the keyword in the matching result, so as to select the first data element they want from at least one second standard data element.
  • the data standard management system receives the data element identifier of the corresponding first data element.
  • The aforementioned matching result usually includes the information of a designated number of second standard data elements, where the designated number is an integer greater than 1, which improves the reference value of the matching result for business personnel.
  • The algorithm used by the data standard management system to match the received keyword with the data element identifiers of the multiple standard data elements may be a fuzzy matching algorithm, such as a search algorithm in ElasticSearch (ES for short).
  • A fuzzy matching algorithm matches with a certain degree of accuracy according to the given conditions or requirements.
  • The principle of fuzzy matching is to first search for content that is exactly the same as the search content, and then search for content that is close to it.
  • A fuzzy matching algorithm also allows part of the literal order of the search keyword to be reversed or separated. The search results can include synonyms, near-synonyms, and related words of the keyword, as well as phrases containing the keyword.
  • The matching result obtained with a fuzzy matching algorithm can include both exact matches and results other than exact matches.
  • Compared with an exact matching algorithm, the matching result of fuzzy matching covers a wider range of second standard data elements and contains more information of second standard data elements, thereby improving the reference value of the matching result for business personnel.
  • An exact matching algorithm determines a match only when the search keyword and the data element identifier of a standard data element are exactly the same, so its matching restriction is precise and strict.
  • the matching algorithm provided in the embodiment of the present application may also be other algorithms, which is not limited.
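  • The difference between exact and fuzzy matching can be sketched with Python's standard `difflib` module. This is a stand-in, not the ElasticSearch algorithm mentioned above and not the matching degree formula of the application: the similarity score is `SequenceMatcher.ratio()`, and the function names, the `threshold` of 0.5, and the `limit` of 5 are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import List


def exact_match(keyword: str, identifiers: List[str]) -> List[str]:
    """Return only the identifiers that are exactly equal to the keyword."""
    return [i for i in identifiers if i == keyword]


def fuzzy_match(
    keyword: str, identifiers: List[str], limit: int = 5, threshold: float = 0.5
) -> List[str]:
    """Return identifiers close to the keyword, exact matches first.

    Similarity is difflib's ratio (an assumption for this sketch); results
    scoring below `threshold` are dropped and at most `limit` are returned.
    """
    scored = []
    for identifier in identifiers:
        if identifier == keyword:
            score = 1.0
        else:
            score = SequenceMatcher(None, keyword.lower(), identifier.lower()).ratio()
        if score >= threshold:
            scored.append((score, identifier))
    # Highest similarity first; Python's sort is stable for equal scores.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [identifier for _, identifier in scored[:limit]]
```

  • For the keyword "integer test", exact matching returns nothing, while the fuzzy sketch still proposes both "integer test field" and "integer test field 1" as candidates for the business person to choose from.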
  • the data standard management system also supports the conditional search function.
  • the corresponding search prompt function includes not only the corresponding results of the keyword search function, but also the corresponding results of the conditional search function.
  • The data standard management system can provide a condition input box in the user interface, and business personnel can input search conditions in the condition input box. Based on the search conditions, the data standard management system searches among the data element identifiers of the multiple standard data elements stored in the data standard library for the data element identifiers that meet the search conditions. Further, the business personnel can perform a conditional search while performing a keyword search. Accordingly, the data standard management system uses, as the matching result, the information of the second standard data elements that meet the search conditions and whose data element identifiers match the keyword. In this way, a more accurate recommendation of the information of second standard data elements can be provided to the business personnel, and the time the business personnel spend looking through the information of multiple second standard data elements is reduced, so that they can quickly select the information of the standard data element they want.
  • The search conditions may include release time, competent department information, and/or standard category, and so on.
  • The release time refers to the release time of the data standard to which a standard data element in the data standard library belongs, such as 2018. The competent department information refers to the information of the manager of the data standard to which the standard data element belongs, such as a certain electrical appliance industry association. The standard category refers to the category of the data standard to which the standard data element belongs, such as security or product.
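  • Combining a keyword search with search conditions can be sketched as a simple filter. The sketch rests on assumptions: the record type `StandardDataElement`, its field names, and the function `conditional_search` are hypothetical, and the keyword match is reduced to a case-insensitive substring test rather than the fuzzy matching discussed earlier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StandardDataElement:
    """Hypothetical record of a standard data element and its data standard."""
    identifier: str
    release_year: int   # release time of the data standard
    department: str     # competent department information
    category: str       # standard category, e.g. "security" or "product"


def conditional_search(
    elements: List[StandardDataElement],
    keyword: Optional[str] = None,
    release_year: Optional[int] = None,
    department: Optional[str] = None,
    category: Optional[str] = None,
) -> List[StandardDataElement]:
    """Keep the elements that match the keyword and every condition given.

    A condition left as None places no restriction on the result.
    """
    results = []
    for element in elements:
        if keyword is not None and keyword.lower() not in element.identifier.lower():
            continue
        if release_year is not None and element.release_year != release_year:
            continue
        if department is not None and element.department != department:
            continue
        if category is not None and element.category != category:
            continue
        results.append(element)
    return results
```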
  • FIG. 4 is a schematic diagram of an interface provided by the data standard management system in an embodiment of the present application.
  • In FIG. 4, the matching result output by the data standard management system includes data element identifiers (FIG. 4 takes the Chinese name of the data element as the data element identifier as an example).
  • The information of two second standard data elements, "integer test field" and "integer test field 1", is retrieved. No restriction on the release time is applied in the process of retrieving the information of the second standard data elements, and the business personnel can select one second standard data element as the first data element based on the matching result.
  • the information included in the matching result may also have other forms.
  • For example, the matching result includes only the data element identifier of the at least one second standard data element; or, the matching result includes only the data element description information of the at least one second standard data element; or, in addition to the aforementioned information of the at least one second standard data element, the matching result also includes the matching degree between the keyword and the data element identifier of each of the at least one second standard data element, so as to improve the reference value of the matching result for business personnel.
  • the information of the multiple second standard data elements may be arranged in a first specified order.
  • the first specified order can be implemented by the following two exemplary implementation manners:
  • In the first exemplary implementation, the information of the multiple second standard data elements is sorted in descending order of the matching degree between the data element identifier of the second standard data element and the keyword.
  • the matching degree P1 satisfies the first matching degree calculation formula:
  • M is the number of characters of the data element identifier of the second standard data element that is the same as the keyword, which is equivalent to the number of characters corresponding to the intersection of the data element identifier of the second standard data element and the keyword;
  • The foregoing calculation of the matching degree using the first matching degree calculation formula is only an illustrative method for obtaining the matching degree provided in the embodiment of the present application.
  • The matching degree may also be obtained in other ways, and reference may be made to traditional methods of obtaining a matching degree, such as that of a fuzzy matching algorithm (such as ES).
  • For example, in the matching result output by the data standard management system, the information of the second standard data element whose data element identifier is "integer test field" is arranged before the information of the second standard data element whose data element identifier is "integer test field 1".
  • the information of the multiple second standard data elements is sorted in descending order of the priority of the data standard to which the second standard data element belongs (that is, sorted in descending order of priority).
  • the priority of the data standard may include standard priority or time priority.
  • the standard priority refers to the priority of the standard itself.
  • The order of standard priority from high to low is national standard, industry standard, and local standard.
  • Time priority usually means that the closer the release time or implementation time is to the current time, the higher the priority.
  • For example, after sorting in descending order of the standard priority of the data standard to which each second standard data element belongs, in the matching result output by the data standard management system, the information of the second standard data element whose data element identifier is "integer test field 1" is arranged before the information of the second standard data element whose data element identifier is "integer test field".
  • the first specified order may also have other manners.
  • the first specified order is an order determined by combining the foregoing first and second exemplary implementation manners. That is, the information of the multiple second standard data elements is sorted according to the degree of matching between the data element identifier of the second standard data element and the keyword and the priority of the data standard to which the second standard data element belongs. For example, for each second standard data element, the data standard management system can obtain the degree of matching between the data element identifier of the second standard data element and the keyword, and obtain the priority of the data standard to which the second standard data element belongs.
  • The data standard management system sorts the information of each second standard data element according to the sort indication value corresponding to that second standard data element, usually in descending order of the sort indication value.
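  • The three orderings can be sketched as follows. The sketch makes several assumptions that are not taken from the application: the matching degree is supplied as an input rather than computed by the first matching degree calculation formula, the numeric priority values in `STANDARD_PRIORITY` are arbitrary, and the combined sort indication value is a hypothetical weighted sum with equal weights.

```python
from dataclasses import dataclass
from typing import List

# Assumed numeric ranks for the three priority levels named in the text.
STANDARD_PRIORITY = {"national": 3, "industry": 2, "local": 1}


@dataclass
class MatchedElement:
    identifier: str
    match_degree: float   # given as input; not the application's formula
    standard_level: str   # "national", "industry", or "local"


def sort_by_match_degree(items: List[MatchedElement]) -> List[MatchedElement]:
    """First ordering: descending matching degree."""
    return sorted(items, key=lambda m: m.match_degree, reverse=True)


def sort_by_standard_priority(items: List[MatchedElement]) -> List[MatchedElement]:
    """Second ordering: descending priority of the data standard."""
    return sorted(items, key=lambda m: STANDARD_PRIORITY[m.standard_level], reverse=True)


def sort_combined(
    items: List[MatchedElement], w_match: float = 0.5, w_priority: float = 0.5
) -> List[MatchedElement]:
    """Combined ordering by a hypothetical weighted sort indication value."""
    max_priority = max(STANDARD_PRIORITY.values())

    def indication(m: MatchedElement) -> float:
        normalized = STANDARD_PRIORITY[m.standard_level] / max_priority
        return w_match * m.match_degree + w_priority * normalized

    return sorted(items, key=indication, reverse=True)
```

  • With an exact match that belongs to a local standard and a slightly weaker match that belongs to a national standard, the first ordering puts the exact match first and the second ordering puts the national-standard element first, which mirrors the two "integer test field" examples above.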
  • In the second method, the information of the first database table structure is obtained by receiving a data standard document edited offline.
  • The business personnel can generate, offline, a data standard document that includes the information of the first database table structure, and transmit the generated data standard document to the data standard management system.
  • the data standard management system receives the data standard document.
  • The data standard document may be a document of any of multiple data types, for example, a data document of a database table type.
  • the business person can run a third-party modeling tool on the second terminal device, generate a data standard document through the third-party modeling tool, and then input the data standard document into the data standard management system.
  • the third-party modeling tool may be a data modeling tool such as E-Rwin or PowerDesigner.
  • the data standard management system can also support the access of a third-party modeling tool.
  • The third-party modeling tool can access (for example, query) the data standard library to obtain the information of the standard data elements stored in the data standard library, and generate data standard documents based on the information of the standard data elements. That is, the third-party modeling tool can support the generation of data standard documents that meet the requirements of the data standard library.
  • a business person can run a third-party modeling tool on the second terminal device, and the third-party modeling tool can generate a data standard document based on the data standard library.
  • the data standard management system receives data standard documents generated by a third-party modeling tool based on the data standard library.
  • the third-party modeling tool may be a data modeling tool such as E-Rwin or PowerDesigner.
  • In the first optional way, since the generation rules of the data standard documents of third-party modeling tools differ from those required by the data standard library, after the data standard management system obtains the data standard document generated by the third-party modeling tool, the information of the first database table structure needs to be verified to obtain information of the first database table structure that meets the requirements of the data standard library. In the second optional way, because the third-party modeling tool can support the generation of data standard documents that meet the requirements of the data standard library, the information of the corresponding first database table structure fully or partially meets the requirements of the data standard library, so the computational cost of verification can be reduced.
  • The data standard management system may also output a data dictionary template, which is a reference template for the information of the first database table structure. Based on the data dictionary template, the business personnel may input the information of the first database table structure, such as the information of a single data element or a batch of data elements, into the data standard management system through the second terminal device.
  • Accordingly, the data standard management system receives the information of the first database table structure input based on the data dictionary template.
  • the specific method for obtaining the information of the first database table structure may refer to the first optional method or the second optional method of obtaining the information of the first database table structure of the first business system to be launched.
  • the data dictionary template can be as shown in Table 1.
  • The data dictionary template includes the parameters involved in the data element information, as shown in the first row of Table 1.
  • Table 1 takes parameters such as the database name, the English name of the physical table, the English name of the data element, and the Chinese name of the data element as an example. The data dictionary template also includes the explanation information (or description information) of the parameters involved in the data element information, as shown in the second row of Table 1; the explanation information is used to explain the meaning of each corresponding parameter.
  • For example, for the parameter "optional value", the corresponding explanation information is: "optional value and description of data, such as gender: F, female; M, male".
  • the data dictionary template also includes examples of filling in the information of a total of 6 data elements as shown in the third to ninth rows of Table 1 (that is, the information corresponding to one data element in each row), which is used to prompt business personnel how to fill in the data elements Information.
  • Table 1 is only a schematic example provided by the embodiments of this application.
  • The data dictionary template may also take other forms, as long as it provides a reference for the information entered by the business personnel.
  • Because the data standard management system outputs the data dictionary template for reference, business personnel no longer rely solely on their own experience to formulate data element information but work from a common basis. This improves the accuracy of the acquired information of the first database table structure and reduces the complexity and computational cost of the subsequent verification process.
  • Step 202 The data standard management system verifies the information of the table structure of the first database based on the data standard library.
  • the data standard management system needs to verify the information based on the data standard library after obtaining the information of the first database table structure.
  • the verification of the information of the first database table structure may at least include the following data standard symbolic verification and data standard normative verification:
  • the first aspect is the symbolic verification of data standards.
  • the verification process refers to verifying the information of at least one data element based on the information of the standard data element in the data standard library.
  • the information of the standard data element is the aforementioned "symbol".
  • the information of the at least one data element is determined based on the information of the standard data element in the data standard library.
  • the process includes:
  • Step A1 The data standard management system compares the information of the first data element with the information of multiple standard data elements stored in the data standard library.
  • Because the information of standard data elements is usually stored in the data standard library in coded form, it is usually first translated into a form that can be compared with the information of the first data element, for example text, so that the information of the first data element can be matched effectively against the information of the multiple standard data elements.
  • The data standard management system compares the information of the first data element with the information of the multiple standard data elements stored in the data standard library mainly to determine whether any of that information matches the information of the first data element. That is, it determines, for each standard data element, whether its information matches the information of the first data element.
  • determining whether the information of a standard data element matches the information of the first data element depends on the matching condition.
  • the matching condition is that the information of the standard data element is the same as the information of the corresponding parameter in the information of the first data element.
  • Taking Table 1 as an example, the parameter information of the standard data element, such as the database name, the English name of the physical table, the English name of the data element, and the Chinese name of the data element, is the same as the corresponding parameter information of the first data element.
  • The information of a data element includes attribute information (for example, in the first row of Table 1, the parameter information other than the optional values is attribute information); in some scenarios, it may also include a code set (as shown in Table 1).
  • the matching condition is that the attribute information of the standard data element is the same as the information of the corresponding parameter in the attribute information of the first data element.
  • Taking Table 1 as an example, the database name, the English name of the physical table, the English name of the data element, the Chinese name of the data element, and other parameter information of the standard data element (that is, the information of the parameters other than the optional values) is the same as the corresponding parameter information of the first data element.
  • the process of the data standard management system comparing the information of the first data element with the information of multiple standard data elements includes a process of comparing the attribute information of the first data element with the attribute information of the multiple standard data elements.
  • If the matching condition requires every item of attribute information of the standard data element to be the same as the corresponding item of attribute information of the first data element, and the attribute information involves many parameters, then the probability that the data standard library contains information matching the information of the first data element is low, and so is the matching efficiency.
  • the matching condition is that the information of the specified parameter in the attribute information of the standard data element is the same as the information of the corresponding parameter in the attribute information of the first data element.
  • the specified parameter usually belongs to the parameters describing the attributes of the data element itself, such as parameters such as name, definition, structure, and/or value rules.
  • The name can include parameters such as the English name of the data element, and the definition can include parameters such as character type, character length, and/or character precision. Because the specified parameters describe the attributes of the data element and are few in number, this matching condition makes it more likely that the data standard library contains information matching the information of the first data element, which improves matching efficiency.
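  • The specified-parameter matching condition described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiments: the field names (`english_name`, `char_type`, `char_length`, `chinese_name`) and the sample records are assumptions introduced here.

```python
# Hypothetical parameter names; the source names only the kinds of parameters
# (name, definition, structure, value rules), not concrete field names.
SPECIFIED_PARAMETERS = ("english_name", "char_type", "char_length")


def matches(standard, element, parameters=SPECIFIED_PARAMETERS):
    """Return True when every specified parameter is present in both
    records and carries the same value."""
    return all(
        p in standard and p in element and standard[p] == element[p]
        for p in parameters
    )


def find_matching_standards(element, standard_library):
    """Return the standard data elements whose specified parameters match."""
    return [s for s in standard_library if matches(s, element)]


# Illustrative data only.
standard_library = [
    {"english_name": "CUST_NAME", "chinese_name": "客户名称",
     "char_type": "VARCHAR", "char_length": 64},
    {"english_name": "GENDER", "chinese_name": "性别",
     "char_type": "CHAR", "char_length": 1},
]
first_element = {"english_name": "GENDER", "chinese_name": "性别代码",
                 "char_type": "CHAR", "char_length": 1}
matched = find_matching_standards(first_element, standard_library)
```

  • In this sketch the first data element still matches the GENDER standard data element although the Chinese names differ, because only the specified parameters are compared; requiring all attribute parameters to be equal would reject it.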
  • Step A2 When the information of the first data element does not match the information of any of the multiple standard data elements in the data standard library, the data standard management system sends first modification prompt information, which instructs that the information of the first data element be updated.
  • the data standard management system may send the first modification prompt information to the second terminal device, and the second terminal device presents the first modification prompt information to the business personnel, so that the business personnel can update the information of the first data element.
  • Step A3 After receiving the updated information of the first data element that matches the information of any standard data element among the multiple standard data elements, the data standard management system determines that the verification of the information of the first data element is successful.
  • After receiving the updated information of the first data element, the data standard management system compares it with the information of the multiple standard data elements; for this comparison step, refer to step A1. When the updated information of the first data element still does not match the information of any of the multiple standard data elements in the data standard library, the data standard management system sends the first modification prompt information again; for this prompt step, refer to step A2. After receiving the updated information of the first data element again, the data standard management system repeats the comparison step, the prompt step, and the receiving step until the information of any one of the multiple standard data elements in the data standard library matches the updated information of the first data element.
  • the data standard management system determines that the verification of the information of the first data element is successful.
  • In this way, business personnel can use the first modification prompt information, sent multiple times by the data standard management system, to modify the information of the first data element repeatedly until it meets the requirements of the standard data elements in the data standard library, so that the data element information they define is consistent with the information of a standard data element in the data standard library.
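  • The repetition of steps A1 to A3 can be sketched as a simple loop. This is an illustration under assumptions: `request_update` is a hypothetical callback standing in for sending the first modification prompt information and receiving the update from the second terminal device, the matching condition is reduced to the English name, and the `max_rounds` limit is added here only so the sketch always terminates.

```python
def same_english_name(standard, element):
    # Simplified matching condition used only for this illustration.
    return standard["english_name"] == element["english_name"]


def verify_first_element(element, standard_library, request_update,
                         is_match=same_english_name, max_rounds=10):
    """Repeat the comparison (A1) and prompt (A2) steps until a match
    is found (A3). Returns the verified element, or None if max_rounds
    updates were requested without reaching a match."""
    rounds = 0
    while True:
        if any(is_match(s, element) for s in standard_library):
            return element                  # step A3: verification succeeded
        if rounds == max_rounds:
            return None                     # safety limit (assumption)
        element = request_update(element)   # step A2, then receive the update
        rounds += 1


# Illustrative data only: two corrections arrive before a match is reached.
library = [{"english_name": "CUST_NAME"}, {"english_name": "GENDER"}]
corrections = iter([{"english_name": "CUST_NM"}, {"english_name": "CUST_NAME"}])
verified = verify_first_element({"english_name": "CUSTNAME"}, library,
                                request_update=lambda _old: next(corrections))
```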
  • The information of the first database table structure is carried in a data standard document that can be edited online or offline. Therefore, when business personnel update the information of the first data element, they usually update the information of the first database table structure in which the first data element is located, that is, they update the data standard document.
  • the data standard management system needs to locate the information of the updated first data element in the information of the updated first database table structure, and then verify the information of the updated first data element.
  • In one option, after receiving the updated information of the first database table structure, the data standard management system locates the information of the first data element among the information of all data elements in the updated information of the first database table structure. For example, the data standard management system may scan all the data elements in the information of the first database table structure, so as to locate the information of the first data element among the information of all the data elements. This process is called full verification.
  • In another option, the incremental data elements may be determined from the updated information of the first database table structure, and the information of the first data element is then determined from the information of the incremental data elements. This process is called incremental verification.
  • For incremental verification, the data standard management system establishes an update instruction rule. When business personnel update data element information through the second terminal device, they do so according to the update instruction rule, so that the data standard management system can effectively locate the incremental data elements.
  • In one example, the data standard management system carries the information of the first database table structure in the first modification prompt message and adds a remark field to the information of the first database table structure. Sending the first modification prompt message is then equivalent to rolling back the information of the first database table structure; if that information is carried in the data standard document, this amounts to a document rollback.
  • After receiving the information of the first database table structure through the second terminal device, the business person updates the data element information and adds target remark information in the remark field; the target remark information indicates the updated data elements.
  • Then, after receiving the updated information of the first database table structure, the data standard management system can determine the updated data elements, that is, the incremental data elements, by querying the remark field, and then locate the first data element among the incremental data elements.
  • For example, the information of the first database table structure includes 6 rows of data element information. After receiving it through the second terminal device, the business person updates the information of the data elements in the first row and the third row and adds, in the remark field, target remark information indicating the data element in the first row and the data element in the third row. By querying the remark field, the data standard management system can then determine that the data elements in the first row and the third row are incremental data elements.
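  • The remark-field example above can be sketched as follows. The field name `remark` and the marker value `UPDATED` are assumptions made for illustration; the source does not specify how the target remark information is encoded.

```python
def incremental_elements(rows, remark_field="remark", target_remark="UPDATED"):
    """Return (row_number, row) pairs whose remark field carries the
    target remark, i.e. the incremental data elements."""
    return [
        (number, row)
        for number, row in enumerate(rows, start=1)
        if row.get(remark_field) == target_remark
    ]


# Six rows of data element information; rows 1 and 3 were updated.
rows = [{"english_name": f"ELEMENT_{i}", "remark": ""} for i in range(1, 7)]
rows[0]["remark"] = "UPDATED"
rows[2]["remark"] = "UPDATED"
changed_rows = [number for number, _ in incremental_elements(rows)]
```

  • Only the rows carrying the target remark are then passed to the comparison step, instead of all six rows as in full verification.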
  • In another example, the data standard management system carries the information of the first database table structure in the first modification prompt message and adds a plug-in to the information of the first database table structure.
  • After receiving the information of the first database table structure through the second terminal device, the business person updates the data element information in it, and the plug-in automatically marks the updated data elements.
  • After receiving the updated information of the first database table structure, the data standard management system can determine the updated data elements, that is, the incremental data elements, from the marks made by the plug-in, and then locate the first data element among the incremental data elements.
  • the plug-in can identify the updated data element by adding comments, highlighting, and/or adding a specified color.
  • For example, the information of the first database table structure includes 6 rows of data element information. After receiving it through the second terminal device, the business person updates the information of the data elements in the first row and the third row, and the plug-in highlights that information. The data standard management system then determines the highlighted data elements in the first row and the third row to be incremental data elements.
  • With incremental verification, the number of data elements queried by the data standard management system can be reduced, and the efficiency of determining updated data elements can be improved.
  • The aforementioned first modification prompt information may include information of at least one first standard data element, where the data element identifier of each first standard data element matches the data element identifier of the first data element.
  • In this way, business personnel can obtain from the first modification prompt information the information of each first standard data element related to the data element identifier of the first data element, and select from the at least one first standard data element the one to which they want to modify the first data element.
  • The aforementioned first modification prompt information usually includes information of a designated number of first standard data elements, where the designated number is an integer greater than 1, which improves the reference value of the first modification prompt information for business personnel.
  • The algorithm with which the data standard management system matches the received data element identifier of the first data element against the data element identifiers of the multiple standard data elements may be a fuzzy matching algorithm; that is, the data element identifier of each first standard data element fuzzily matches the data element identifier of the first data element.
  • the fuzzy matching algorithm is a search algorithm in ElasticSearch.
  • The matching result obtained with the fuzzy matching algorithm can include both exact matches and results other than exact matches. Compared with exact matching, fuzzy matching returns broader content and more first standard data element information, which improves the reference value of the matching result for business personnel.
  • the matching algorithm may also be other algorithms, which is not limited in the embodiment of the present application.
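  • As an illustration of fuzzy matching on data element identifiers, the sketch below uses Python's standard `difflib` module. This is a stand-in chosen for a self-contained example, not the ElasticSearch search algorithm mentioned above; the identifiers, the `limit`, and the `cutoff` values are assumptions.

```python
import difflib


def fuzzy_candidates(identifier, standard_identifiers, limit=3, cutoff=0.6):
    """Return up to `limit` standard identifiers similar to `identifier`,
    best match first. Exact matches score highest, so they are included
    alongside near matches."""
    return difflib.get_close_matches(
        identifier, standard_identifiers, n=limit, cutoff=cutoff
    )


# Illustrative identifiers only.
standard_identifiers = ["CUST_NAME", "CUST_ID", "ORDER_DATE"]
candidates = fuzzy_candidates("CUST_NAM", standard_identifiers)
```

  • Here a misspelled identifier still yields the closest standard identifier first, followed by other sufficiently similar ones, which is the kind of broader result set that the first modification prompt information can present to business personnel.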
  • the second specified order can be implemented in the following two exemplary implementation manners:
  • In the first implementation manner, the information of the multiple first standard data elements is sorted in descending order of the matching degree between the data element identifier of the first standard data element and the keyword.
  • In the second implementation manner, the information of the multiple first standard data elements is sorted in descending order of the priority of the data standard to which the first standard data element belongs.
  • the second specified order may also have other manners.
  • the second specified order is an order determined by combining the foregoing first and second exemplary implementation manners. That is, the information of the multiple first standard data elements is sorted according to the degree of matching between the data element identifier of the first standard data element and the data element identifier of the first data element and the priority of the data standard to which the first standard data element belongs.
  • The data standard management system can obtain the matching degree between the data element identifier of the first standard data element and the data element identifier of the first data element, obtain the priority of the data standard to which the first standard data element belongs, and assign a value to that priority according to a specified rule, where the priority is positively related to the assigned value. Then, using weights assigned in advance to the matching degree and the priority respectively, it determines the sort indication value of the first standard data element by weighted summation.
  • The data standard management system sorts the information of each first standard data element according to the sort indication value corresponding to that first standard data element, usually in descending order of the sort indication value.
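  • The weighted-summation ordering can be sketched as follows. The priority values, the weights (0.7 and 0.3), and the sample candidates are hypothetical; the source only requires that a higher priority receives a larger value and that the weights are assigned in advance.

```python
# Hypothetical values assigned to data standard priorities; a higher
# priority receives a larger value, as the text requires.
PRIORITY_VALUES = {"national": 1.0, "industry": 0.6, "enterprise": 0.3}


def sort_indication_value(matching_degree, priority,
                          w_match=0.7, w_priority=0.3):
    """Weighted sum of the matching degree and the assigned priority value."""
    return w_match * matching_degree + w_priority * PRIORITY_VALUES[priority]


def sort_candidates(candidates):
    """Sort first standard data elements in descending order of their
    sort indication value."""
    return sorted(
        candidates,
        key=lambda c: sort_indication_value(c["matching_degree"], c["priority"]),
        reverse=True,
    )


# Illustrative candidates only.
candidates = [
    {"identifier": "CUST_NAME", "matching_degree": 0.9, "priority": "enterprise"},
    {"identifier": "CUSTOMER_NAME", "matching_degree": 0.8, "priority": "national"},
    {"identifier": "CUST_ID", "matching_degree": 0.5, "priority": "industry"},
]
ordered = [c["identifier"] for c in sort_candidates(candidates)]
```

  • With these assumed weights, a candidate from a higher-priority data standard can rank above one with a slightly higher matching degree, which is the effect of combining the two criteria.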
  • In step 202, the content of the data element identifier of the first data element may differ from the content of the keyword, but both consist of one or more characters, so the details are not repeated here.
  • the data standard management system may also add some remark fields according to the verification situation to remind the business personnel of information that needs attention.
  • Suppose the second data element is one of the at least one data element included in the information of the first database table structure.
  • When the value corresponding to the second data element is an enumerable value, data element remark information is added for the second data element; the data element remark information identifies the enumerable values corresponding to the second data element.
  • Adding the data element remark information to the second data element supplies accurate code set information.
  • For example, the name of the second data element is "age", and data element remark information is added for it that identifies the 120 enumerable values, from 1 to 120, corresponding to the second data element.
  • The target database table structure determined based on the first database table structure may still include the data element remark information added for the second data element.
  • After the first business system goes online, if the data corresponding to the second data element needs to be collected, it can be collected directly in the enumerable-value format corresponding to the second data element, which ensures that the collected data meets the format requirements of the data standard library, that is, complies with the relevant standards.
  • the second aspect is the normative verification of data standards.
  • The verification process refers to verifying the normativity of the information of the data standard, mainly its format.
  • The process of verifying the information of the first database table structure based on the data standard library, which includes the information of multiple standard data elements, includes:
  • Step B1 The data standard management system detects whether the format of the information of the first database table structure meets the specified format requirements.
  • The information of the first database table structure includes information of at least one data element, and may also include information of the first database and first table information. In that case, the process in which the data standard management system detects whether the format of the information of the first database table structure meets the specified format requirements includes:
  • detecting whether the format of each data element meets the specified data element format requirements, such as whether the English name of the data element consists of designated characters (such as uppercase English letters);
  • detecting whether the format of the information of the first database meets the specified database format requirements, such as whether the English name of the database consists of designated characters (such as uppercase English letters) and whether the length of the database code is less than a first specified length threshold, which can be 60 bits (here, "bit" refers to the number of digits);
  • detecting whether the format of the first table information meets the specified database table format requirements, such as whether the database table identifier contains the specified characters (such as uppercase English letters) and whether the length of the database table code is less than a second specified length threshold, which can be 60 bits.
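  • The format checks of step B1 can be sketched as follows. The field names, the exact character pattern (uppercase letters, with digits and underscores also allowed after the first letter), and the sample data are assumptions; the 60-position length threshold follows the example value given above.

```python
import re

# Assumed pattern for "designated characters": starts with an uppercase
# letter, followed by uppercase letters, digits, or underscores.
NAME_PATTERN = re.compile(r"[A-Z][A-Z0-9_]*")
MAX_CODE_LENGTH = 60  # example threshold from the text


def check_format(info):
    """Return the list of fields that violate the format requirements;
    an empty list means the format verification succeeded."""
    problems = []
    if not NAME_PATTERN.fullmatch(info["database_name"]):
        problems.append("database_name")
    if len(info["database_code"]) >= MAX_CODE_LENGTH:
        problems.append("database_code")
    if not NAME_PATTERN.fullmatch(info["table_name"]):
        problems.append("table_name")
    if len(info["table_code"]) >= MAX_CODE_LENGTH:
        problems.append("table_code")
    for index, element in enumerate(info["data_elements"], start=1):
        if not NAME_PATTERN.fullmatch(element["english_name"]):
            problems.append(f"data_elements[{index}].english_name")
    return problems


# Illustrative data only: the second data element uses lowercase letters.
info = {
    "database_name": "CRM_DB",
    "database_code": "CRM001",
    "table_name": "CUSTOMER",
    "table_code": "T_CUSTOMER",
    "data_elements": [{"english_name": "CUST_NAME"}, {"english_name": "gender"}],
}
problems = check_format(info)
```

  • Returning the offending fields, rather than a single pass or fail flag, lets the second modification prompt message point business personnel to the information that does not meet the format requirements.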
  • Step B2 When the format of the information of the first database table structure does not meet the specified format requirements, the data standard management system sends a second modification prompt message, which instructs that the format of the information of the first database table structure be updated.
  • the data standard management system may send the second modification prompt information to the second terminal device, and the second terminal device presents the second modification prompt information to the business personnel, so that the business personnel can update the format of the information of the first database table structure .
  • Step B3 After receiving the updated information of the first database table structure whose format meets the format requirements, the data standard management system determines that the format verification of the information of the first database table structure is successful.
  • After receiving the updated information of the first database table structure, the data standard management system detects whether the format of that information meets the specified format requirements; for this detection step, refer to step B1. When the format of the information of the first database table structure still does not meet the specified format requirements, the data standard management system sends the second modification prompt message again; for this prompt step, refer to step B2. After receiving the updated information of the first database table structure again, the data standard management system repeats the detection step, the prompt step, and the receiving step until the format of the information of the first database table structure meets the specified format requirements.
  • the data standard management system determines that the format verification of the information of the first database table structure is successful.
  • In this way, business personnel can use the second modification prompt message, sent multiple times by the data standard management system, to modify the format of the information of the first database table structure repeatedly until it meets the format requirements of the data standard management system, so that the database table structure information they define meets the requirements.
  • The aforementioned second modification prompt message usually indicates which part of the information of the first database table structure does not meet the specified format requirements, for example that the information of a data element, the information of the first database, or the first table information does not meet the corresponding format requirements.
  • When detecting the updated information, the data standard management system can scan all the information in the information of the first database table structure and detect whether the scanned information meets the corresponding format requirements.
  • Alternatively, it can detect only the information that did not meet the corresponding format requirements last time, without detecting all the information in the first database table structure.
  • In this way, the information of the first database table structure created for the first business system, such as the first table information, the data element information, and the code set, can be matched and verified automatically, and the system administrator only needs to perform a simple check. This greatly saves manpower, and because manual review steps are reduced, manual errors decrease and the accuracy of the final target database table structure improves. Compared with traditional technology, more than 75% of the labor cost can be saved.
  • The original process of filling in, transmitting, comparing, filling in review comments, giving feedback, modifying, and re-reviewing through the database table online is simplified to automatic verification, comparison and modification, submission for review, and feedback of the review result. This reduces the offline information transmission steps, saves information transmission time, further simplifies the workflow, and improves work efficiency, usually by more than 70%.
  • Step 203 After verifying the information of the first database table structure, the data standard management system obtains the target database table structure of the first business system, and the target database table structure is determined based on the first database table structure.
  • The target database table structure is the database table structure adopted by the first business system after it goes online. It is obtained in advance, before the first business system goes online, but is used only after the first business system goes online.
  • the information about the table structure of the target database includes part or all of the information about the table structure of the first database. In an optional manner, the information of the table structure of the first database may be directly determined as the information of the table structure of the target database.
  • the information of the table structure of the first database needs to be further adjusted to obtain the information of the table structure of the target database.
  • The data elements of different business systems may vary according to the actual situation. For example, the data standard library may not record the standard data element information corresponding to certain data elements, yet for a particular business system those data elements are allowed to be added to the business system.
  • The information of these data elements cannot be effectively verified by the data standard management system. Therefore, after step 202, that is, after the data standard management system automatically verifies the information of the first database table structure, the data standard management system also supports manual secondary verification of the information of the first database table structure.
  • the secondary verification process may include:
  • the data standard management system sends the information of the first data element to the designated terminal device after determining that the information of the first data element is successfully verified.
  • The designated terminal device is the terminal device of the verification personnel who perform the secondary verification; it may be the terminal device of a system administrator or a project administrator. For example, assuming that the verification personnel is a system administrator, the designated terminal device is the aforementioned third terminal device.
  • After receiving the information of the first data element through the designated terminal device, the verification personnel determine whether the information of the first data element needs to be modified and, based on the result, send first verification response information to the data standard management system through the designated terminal device. The first verification response information either instructs that the information of the first data element be modified or indicates that the secondary verification of the information of the first data element is successful.
  • the data standard management system receives the first verification response information, and sends the first verification response information to the second terminal device of the business person.
  • When the first verification response information instructs that the information of the first data element be modified, the business personnel can modify the information of the first data element through the second terminal device and send it again, through the data standard management system, to the designated terminal device for verification by the verification personnel, until the first verification response information received by the second terminal device indicates that the secondary verification of the information of the first data element is successful.
  • When the first verification response information indicates that the secondary verification is successful, the business person does not need to modify the information of the first data element.
  • the data standard management system after determining that the format verification of the information of the first database table structure is successful, sends the information of the first database table structure to the designated terminal device.
  • After receiving the information of the first database table structure through the designated terminal device, the verification personnel determine whether the format of the information of the first database table structure needs to be modified and, based on the result, send second verification response information to the data standard management system through the designated terminal device. The second verification response information either instructs that the format of the information of the first database table structure be modified or indicates that the secondary verification of the format of the information of the first database table structure is successful.
  • the data standard management system receives the second verification response information, and sends the second verification response information to the second terminal device of the business person.
  • When the second verification response information instructs that the format be modified, the business personnel can modify the format of the information of the first data element through the second terminal device and send it again, through the data standard management system, to the designated terminal device for verification by the verification personnel, until the second verification response information received by the second terminal device indicates that the secondary verification of the format of the information of the first data element is successful.
  • When the second verification response information indicates that the secondary verification is successful, the business person does not need to modify the format of the information of the first data element.
  • The information of the first data element and the information of the first database table structure can be carried in the same verification request (also called a verification application) and sent to the designated terminal device, and the first verification response information and the second verification response information can be the same information. This reduces the number of interactions between the data standard management system and each terminal device and saves network overhead.
  • The information of the target database table structure may be the information of the first database table structure after the manual secondary verification succeeds.
  • A third modification prompt message can also be sent to prompt the business personnel to further adjust the information of the first database table structure in order to obtain the target database table structure.
  • The first database table structure whose information meets the bid rate (standard-landing rate) requirement and/or the matching rate requirement is finally determined as the target database table structure.
  • The bid rate requirement means that the bid rate of the information of the first database table structure is greater than a specified bid rate threshold.
  • The bid rate is the ratio of the number of data elements in the information of the first database table structure that actually land on the standard to the number of data elements in that information that should land on the standard.
  • A data element that actually lands on the standard is a data element that matches a standard data element; for the definition of a match, refer to the definition in the aforementioned step A1.
  • A data element that should land on the standard is a data element whose business identifier (such as the English name of the data element) is the same as the business identifier of a standard data element but that does not match the standard data element (that is, of the matching conditions, only the condition that the business identifiers are the same is met).
  • The matching rate requirement means that the matching rate of the information of the first database table structure is greater than a specified matching rate threshold.
  • The matching rate is the ratio of the number of data elements that should land on the standard to the total number of data elements included in the database table structure of the first business system.
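  • The two ratios defined above can be sketched as follows. This is a minimal illustration only: the function names, the default thresholds, the handling of an empty denominator, and the choice to require both conditions (the text allows "and/or") are assumptions, not part of the embodiment.

```python
def bid_rate(actual_count: int, should_count: int) -> float:
    """Elements that actually land on the standard / elements that should."""
    # Assumption: an empty denominator yields 0.0 rather than an error.
    return actual_count / should_count if should_count else 0.0


def matching_rate(should_count: int, total_count: int) -> float:
    """Elements that should land on the standard / all elements in the table structure."""
    return should_count / total_count if total_count else 0.0


def meets_requirements(actual_count: int, should_count: int, total_count: int,
                       bid_threshold: float = 0.9,
                       matching_threshold: float = 0.5) -> bool:
    """Check both rates against hypothetical thresholds (strictly greater than)."""
    return (bid_rate(actual_count, should_count) > bid_threshold
            and matching_rate(should_count, total_count) > matching_threshold)
```

  • For example, with 19 landed elements out of 20 that should land, in a table structure of 25 elements, the bid rate is 0.95 and the matching rate is 0.8, so both hypothetical thresholds are exceeded.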
  • Step 204: The data standard management system outputs the script corresponding to the target database table structure.
  • The data standard management system supports a script output function. After verifying and auditing the information of the first database table structure and obtaining the information of the target database table structure, the data standard management system can generate and output a script (also called a table building script) corresponding to the target database table structure.
  • The script is used to generate the information of the target database table structure and includes the database table building statements of the first business system.
  • The business personnel can receive the script through the second terminal device. After the first business system goes online, the business personnel can load and run the script in the first business system.
  • When run, the script generates the information of the target database table structure, and the corresponding database table structure is built according to that information.
  • Because the data standard management system outputs the script, business personnel do not need to write scripts themselves, which reduces their workload and saves labor costs.
  • The data standard management system may also send a request for using the target database table structure to the first terminal device of the project administrator. The request carries the information of the target database table structure, and the project administrator can build the database table based on that information after the first business system goes online. After the construction is completed, the data standard management system sends a database table usage notification to the second terminal device of the business personnel to notify them that the target database table structure can be used after the first business system goes online. In this way, the business personnel do not need to build the database tables themselves.
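  • The table building script of step 204 might be generated along the following lines. This is a hedged sketch: the `DataElement` fields, the function name `build_table_script`, and the generic SQL dialect are assumptions for illustration, and identifiers are interpolated without quoting or escaping, which a real implementation would need to handle.

```python
from dataclasses import dataclass


@dataclass
class DataElement:
    identifier: str        # hypothetical: column name, e.g. the English name of the data element
    data_type: str         # hypothetical: SQL type recorded in the table structure information
    nullable: bool = True  # hypothetical: whether the column may be NULL


def build_table_script(table_name: str, elements: list[DataElement]) -> str:
    """Render one CREATE TABLE statement from verified data element information."""
    columns = []
    for element in elements:
        null_clause = "" if element.nullable else " NOT NULL"
        columns.append(f"    {element.identifier} {element.data_type}{null_clause}")
    body = ",\n".join(columns)
    return f"CREATE TABLE {table_name} (\n{body}\n);"


script = build_table_script(
    "person",
    [DataElement("id_number", "VARCHAR(18)", nullable=False),
     DataElement("full_name", "VARCHAR(64)")],
)
print(script)
```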
  • The aforementioned steps 201 to 204 can be performed by the data standard management device 102 in the data standard management system 10 shown in FIG.
  • The data standard management system 10 also supports the life cycle management function corresponding to the subsequent steps 205 to 206, and the data management function shown in step 207.
  • The life cycle management function is performed by the life cycle management device 103, and the data management function is performed by the data management device 104.
  • Steps 205 to 206 are as follows:
  • Step 205: The data standard management system receives a data standard library operation request.
  • The data standard library operation request includes a standard data element addition request, a standard data element update request, a standard data element deletion request, or a standard data element query request.
  • Various operations on the data standard library are supported, such as standard data element addition operations, standard data element update operations, standard data element deletion operations, and standard data element query operations.
  • The corresponding data standard library operation requests are, respectively, a standard data element addition request, a standard data element update request, a standard data element deletion request, and a standard data element query request.
  • The standard data element addition request is used to request that the information of one or more data elements be added to the data standard library.
  • The system administrator can collect national standards, industry standards, or local standards, split the collected data standards into multiple data elements, and add the information of these data elements to the data standard library through one or more standard data element addition requests.
  • Each standard data element addition request can carry the information of a single data element or of a batch of data elements. For example, if a local standard adds some data elements for a new business system, these data elements need to be added to the data standard library to guide the business system construction of the industry or of subordinate units and to keep data standards unified; the system administrator adds these data elements through a standard data element addition request. For another example, if a standard data element is added to a national standard or industry standard, the system administrator can add the data element through a standard data element addition request to keep pace with that standard.
  • The system administrator can collect data elements based on the data standard management system.
  • The data standard management system can also output a data element collection template.
  • The data element collection template is a reference template for data element collection, and its structure can refer to the aforementioned data dictionary template.
  • Based on the data element collection template, business personnel input data element information, such as the information of a single data element or of a batch of data elements, to the data standard management system through the second terminal device.
  • The data standard management system sends the received data element information to the third terminal device of the system administrator for the system administrator's reference.
  • Alternatively, a business person can enter data elements directly in the data element collection template through the second terminal device to obtain an updated data element collection template, and the data standard management system sends the updated data element collection template to the third terminal device of the system administrator.
  • The standard data element update request is used to request an update of the information of one or more data elements in the data standard library. For example, if a national standard or industry standard updates a standard data element, the system administrator requests an update of that data element through a standard data element update request to keep pace with the standard.
  • The standard data element deletion request is used to request deletion of the information of one or more data elements in the data standard library. For example, when the data standard corresponding to a certain national standard, industry standard, or local standard is no longer in use, the system administrator can delete the information of the data elements corresponding to that data standard from the data standard library through one or more standard data element deletion requests.
  • When the data standard management system subsequently deletes a standard data element from the data standard library, it can add a deletion mark to the standard data element instead of physically deleting it, so that the element can still be queried later.
  • The deletion mark indicates that the standard data element has been discarded, and it can also carry the reason for discarding it, for example, that the data standard is no longer in use.
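  • The deletion mark described above corresponds to what is commonly called a soft delete. The following is a minimal in-memory sketch; the class names, field names, and the `include_deleted` parameter are hypothetical and only illustrate the idea of marking rather than physically removing a standard data element.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandardDataElement:
    identifier: str
    name: str
    deleted: bool = False                # the deletion mark
    delete_reason: Optional[str] = None  # optional reason carried by the mark


class DataStandardLibrary:
    def __init__(self) -> None:
        self._elements: dict[str, StandardDataElement] = {}

    def add(self, element: StandardDataElement) -> None:
        self._elements[element.identifier] = element

    def delete(self, identifier: str, reason: str) -> None:
        """Mark the element as discarded; the record itself is kept."""
        element = self._elements[identifier]
        element.deleted = True
        element.delete_reason = reason

    def query(self, identifier: str,
              include_deleted: bool = False) -> Optional[StandardDataElement]:
        """Return the element, hiding marked elements unless explicitly requested."""
        element = self._elements.get(identifier)
        if element is None or (element.deleted and not include_deleted):
            return None
        return element
```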
  • The standard data element query request is used to query the information of data elements in the data standard library.
  • The data standard management system can also output a data element template, which is a reference template for data elements to be added. Based on the data element template, the operator can use the corresponding terminal device to input, to the data standard management system, the data element information that needs to be carried in the standard data element addition request; correspondingly, the data standard management system receives the data element information input based on the data element template.
  • The data element template can be as shown in Table 2.
  • The data element template includes the parameters involved in the data element information, as shown in the first row of Table 2.
  • Table 2 is described using, as examples, parameters such as the basic data classification name, the basic data classification code, the identifier of the data element, and the Chinese name of the data element.
  • The data element template also includes explanation information (or description information) for the parameters involved in the data element information, as shown in the second row of Table 2.
  • The explanation information explains the meaning of each corresponding parameter.
  • For example, the explanation information corresponding to one parameter is: "the classification to which the data element belongs, the value is as follows: XX public: XX basic; department public: XX department bureau_public information; department system: XX department bureau _system name".
  • The data element template also includes filled-in examples for a total of three data elements, as shown in the third to fifth rows of Table 2 (one data element per row), which show the operator how to fill in the data element information.
  • Table 2 is only a schematic example provided by the embodiment of the present application. The data element template may also take other forms, as long as it provides a reference for the operator to input the information of the data element.
  • Because the data standard management system outputs data element templates for operators to refer to, operators no longer rely solely on their own experience to formulate data element information but have a basis for doing so, which improves the accuracy of the standard data element information input to the data standard library.
  • Step 206: After successfully authenticating the data standard library operation request, the data standard management system executes the operation corresponding to the data standard library operation request on the data standard library.
  • The data standard library stores the information of multiple standard data elements, and this information is the basis for verifying the database table structure of each business system. If the information of standard data elements were added, deleted, or modified at will, the management of information in the data standard library would become confused and the information would lose its value as a valid reference for verification. Therefore, when data standard library operations are performed, the data standard library operation request needs to be authenticated. Only after the authentication succeeds does the data standard management system perform the corresponding operation on the data standard library.
  • The staff involved include project administrators, business personnel, and system administrators. These staff members need to register a corresponding account in the data standard management system.
  • The corresponding operation information carries the account to identify the operator.
  • Staff with different identities have different account levels and different operating rights in the data standard management system, so the corresponding authentication methods also differ.
  • When the data operation is a standard data element addition, update, or deletion operation, it involves a higher confidentiality level, and ordinary business personnel are not allowed to perform it. Because the data standard management system is maintained by the system administrator, the system administrator usually has the authority to manage the data standard library and can perform operations with a higher confidentiality level. However, the system administrator is not necessarily a professional on the corresponding project and may not know the data standard corresponding to the project, so relevant personnel are also required to perform auxiliary authentication for the data operation. When the data operation is a data element query operation, it involves a lower confidentiality level, and the data can usually be viewed by project administrators, business personnel, and system administrators. However, to avoid malicious access and reduce leaks, relevant personnel can also be added to perform auxiliary authentication for the data operation.
  • The first authentication method: authentication of data operations with a high confidentiality level.
  • The data standard management system detects whether the account carried in the data standard library operation request is a first-level account, where the first level is higher than a specified level threshold.
  • For example, the first-level account is the account of the system administrator.
  • When the data standard management system detects that the account carried in the request is not a first-level account, it determines that the authentication of the data standard library operation request fails.
  • When the data standard management system detects that the account carried in the request is a first-level account, in one optional manner, it determines that the authentication of the data standard library operation request is successful. In another optional manner, it sends the data standard library operation request to the terminal device corresponding to a second-level account; after receiving a permission instruction indicating that the operation on the data standard library is allowed, it determines that the authentication is successful, and after receiving a prohibition instruction indicating that the operation is not allowed, it determines that the authentication has failed.
  • The second level is higher than or equal to the first level, and the second-level account is different from the first-level account.
  • For example, the second-level account is the account of the project administrator, and the terminal device corresponding to the second-level account is the aforementioned first terminal device.
  • The project administrator determines, according to the content of the request and the account carried in the request, whether the corresponding person is allowed to operate the data standard library. If the operation is allowed, the permission instruction is sent through the first terminal device; if it is not allowed, the prohibition instruction is sent through the first terminal device.
  • The second authentication method: authentication of data operations with a low confidentiality level.
  • The data standard management system detects whether the account carried in the data standard library operation request is an account assigned by the data standard management system, that is, a legal account in the data standard management system. For example, the account assigned by the data standard management system is any one of the accounts of project administrators, business personnel, and system administrators.
  • When the data standard management system detects that the account carried in the request is not an account assigned by the data standard management system, it determines that the authentication of the data standard library operation request fails.
  • When the data standard management system detects that the account carried in the request is an account assigned by the data standard management system, in one optional manner, it determines that the authentication of the data standard library operation request is successful. In another optional manner, it sends the data standard library operation request to the terminal device corresponding to a third-level account; after receiving a permission instruction indicating that the operation on the data standard library is allowed, it determines that the authentication is successful, and after receiving a prohibition instruction indicating that the operation is not allowed, it determines that the authentication has failed.
  • The third level is higher than or equal to the level of the account carried in the request, and the third-level account is different from the account carried in the request.
  • For example, the third-level account is the account of the project administrator or the system administrator.
  • Correspondingly, the terminal device corresponding to the third-level account is, for example, the aforementioned third terminal device.
  • The system administrator determines, according to the content of the request and the account carried in the request, whether the corresponding person is allowed to operate the data standard library. If the operation is allowed, the permission instruction is sent through the third terminal device; if it is not allowed, the prohibition instruction is sent through the third terminal device.
  • After the operation is performed, a data operation response may also be sent to indicate that the operation corresponding to the data standard library operation request has been completed on the data standard library, or that the operation is successful.
  • Correspondingly, a data operation response may be sent to indicate that the operation corresponding to the data standard library operation request is prohibited from being performed on the data standard library, or that the operation has failed.
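  • The two authentication methods of step 206 can be sketched together as follows. This is an illustrative assumption rather than the embodiment's implementation: the numeric levels, the threshold value, the operation names, and the `approver` callback (standing in for the permission or prohibition instruction returned by a higher-level account's terminal device) are all hypothetical, and "first-level account" is interpreted here as any account whose level exceeds the threshold.

```python
from enum import IntEnum
from typing import Callable, Optional


class Level(IntEnum):
    # Hypothetical numeric levels; the text only requires an ordering.
    BUSINESS = 1
    SYSTEM_ADMIN = 2
    PROJECT_ADMIN = 3


HIGH_CONFIDENTIALITY_OPS = {"add", "update", "delete"}
LEVEL_THRESHOLD = Level.BUSINESS  # assumed specified level threshold


def authenticate(operation: str, account: str, accounts: dict[str, Level],
                 approver: Optional[Callable[[str, str], bool]] = None) -> bool:
    """Return True if the data standard library operation request is authenticated."""
    level = accounts.get(account)
    if level is None:
        return False  # not an account assigned by the system
    if operation in HIGH_CONFIDENTIALITY_OPS and level <= LEVEL_THRESHOLD:
        return False  # high-confidentiality operations need a level above the threshold
    if approver is not None:
        # Optional auxiliary authentication: True models a permission
        # instruction, False models a prohibition instruction.
        return approver(operation, account)
    return True


accounts = {"alice": Level.SYSTEM_ADMIN, "bob": Level.BUSINESS}
print(authenticate("add", "alice", accounts))
print(authenticate("add", "bob", accounts))
```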
  • The data standard library operations of the data standard management system also include operations on code sets, such as code set addition operations, code set update operations, code set deletion operations, and code set query operations.
  • The corresponding data standard library operation requests are, respectively, a code set addition request, a code set update request, a code set deletion request, and a code set query request.
  • The corresponding processes can refer to the processes of adding, updating, deleting, and querying the aforementioned standard data elements, and are not described in detail in the embodiment of the present application.
  • In this way, the security of the data in the data standard library and the reliability of the information of the standard data elements in the data standard library can be ensured, and leaks can be effectively prevented.
  • The data standard management system supports the addition (also called release), update, deletion, and query of data elements, as well as the maintenance of code sets, the review of data elements, and other maintenance functions. It is worth noting that the data standard management system can also manage the information of multiple data standards in the form of database tables or documents for users to consult. This series of functions for data standards or data elements can be called data element life cycle management. Based on life cycle management, whole-process, all-round control of data standards can be realized. In the embodiments of the present application, the data standard management system can also systematically manage the documents corresponding to the data standards, for example by creating separate web pages by industry, level, and version, so that users can read and use them.
  • The data standard management system may generate an operation log for each data operation.
  • The data standard management system can query the operation log corresponding to the data standard library periodically or after receiving a query instruction; when the operation log includes an abnormal operation log, it issues an abnormal alarm.
  • Based on abnormal alarms, system administrators can determine whether data operations need to be traced back and can roll back the data, thereby effectively maintaining the data standard management system.
  • Step 207: The data standard management system performs data management based on the artificial intelligence model.
  • The data standard management system also supports data governance functions.
  • The data governance process corresponding to the data governance function can refer to the traditional data governance process.
  • An artificial intelligence model is pre-established in the data standard management system, and data governance is performed through the artificial intelligence model.
  • The data governance process can include the following steps:
  • Step C1: The data standard management system obtains the information of the second database table structure of the second business system after it goes online, where the information of the second database table structure includes the information of at least one data element.
  • Step C1 can refer to the process of step 201 described above, which is not limited in the embodiment of the present application.
  • Step C2: When the data standard management system detects, based on the artificial intelligence model, that the information of the third data element does not match the information of the multiple standard data elements, it determines the target standard data element corresponding to the third data element among the multiple standard data elements. The third data element is one of the at least one data element included in the information of the second database table structure.
  • The data standard management system can input the information of the second database table structure into the artificial intelligence model. The artificial intelligence model detects whether the information of the third data element matches the information of the multiple standard data elements and, when it does not match, determines the target standard data element corresponding to the third data element among the multiple standard data elements.
  • Step C3: Based on the artificial intelligence model, the data standard management system establishes a mapping relationship between the third data element and the target standard data element.
  • After the artificial intelligence model establishes the mapping relationship between the third data element and the target standard data element, the mapping relationship can be output for subsequent use in providing data services.
  • The artificial intelligence model can be obtained by training on the information of multiple standard data elements and the information of sample data elements.
  • In this way, the accuracy and efficiency of establishing the mapping relationship can be improved, thereby improving the effect of data governance.
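  • The control flow of steps C1 to C3 can be illustrated with the sketch below. Note that it does not use a trained artificial intelligence model: plain string similarity from Python's standard `difflib` module is swapped in as a stand-in so that the example is self-contained. The function name, the use of identifiers alone as the "information" of a data element, and the similarity cutoff are assumptions.

```python
import difflib


def map_to_standard(element_ids: list[str], standard_ids: list[str],
                    cutoff: float = 0.6) -> dict[str, str]:
    """Map each non-matching data element to its closest standard data element."""
    standard_set = set(standard_ids)
    mapping: dict[str, str] = {}
    for element_id in element_ids:
        if element_id in standard_set:
            continue  # already matches a standard data element; no mapping needed
        # Stand-in for the model: pick the most similar standard identifier.
        candidates = difflib.get_close_matches(element_id, standard_ids, n=1, cutoff=cutoff)
        if candidates:
            mapping[element_id] = candidates[0]
    return mapping


standards = ["birth_date", "id_number", "gender_code"]
mapping = map_to_standard(["birth_dt", "id_number"], standards)
print(mapping)
```

  • In this sketch, an element for which no sufficiently similar standard identifier exists is simply left unmapped; how such elements are handled is not specified in the text.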
  • Steps 206 and 207 can be located before step 201.
  • In a traditional data standard management system, the database table structure maintained in a business system is checked after the business system goes online. If the database table structure does not meet the requirements of the target data standard that corresponds to the business system and is stored in the data standard management system, the data standard management system establishes a mapping relationship between the database table structure and the target data standard. This process is called the data governance process.
  • In the embodiments of the present application, the database table structure of the business system is verified before the business system goes online.
  • The business system may be going online for the first time (that is, after being newly built) or going online after transformation.
  • The embodiments of the present application schematically provide a scenario where a business system goes online after transformation.
  • This scenario is a Platform as a Service (PaaS) scenario.
  • In this scenario, a project needs to transform the multiple business systems corresponding to the project into microservices.
  • The business systems need to be split and built as microservices.
  • Each new business system involves creating a database table structure. Because the old business systems may have been built by multiple vendors, the vendors' designs of the database table structure are inconsistent. Using the data processing method provided in the embodiment of the present application, the business systems of all vendors can have their database table structures defined before going online.
  • Unified auditing and monitoring of standard landing can also be realized. At the same time, data standards can be published internally or externally based on the needs of business personnel and standard administrators.
  • The embodiment of the present application schematically illustrates the actual implementation process of the foregoing method for processing a database table structure.
  • Business person A of the first business system applies for an account in the data standard management system.
  • The data standard management system pushes the application to the system administrator for approval. If the system administrator approves it, business person A is allowed to operate the first business system.
  • Business person A can create a new database, edit the structure of the database, and/or delete the database in the first business system.
  • Business person A can create database tables through online editing or offline editing (including importing database tables or importing data element information). While editing database tables, business person A can create, edit, or delete data elements.
  • The data processing method checks the information of the database table structure of the business system before the business system goes online, so as to ensure that the business system can use an accurate target database table structure after it goes online.
  • The target database table structure has higher reliability, which improves the quality of the data provided after the business system goes online, reduces the probability of data conversion after the business system goes online, and reduces the cost of subsequent data governance.
  • The data standard management system includes:
  • The first obtaining module 301, configured to obtain the information of the first database table structure of the first business system to be launched, where the information of the first database table structure includes the information of at least one data element; the verification module 302, configured to verify the information of the first database table structure based on the data standard library, where the data standard library includes the information of a plurality of standard data elements; and the second acquisition module 303, configured to obtain the target database table structure of the first business system after the information of the first database table structure is verified, where the target database table structure is determined based on the verified first database table structure.
  • The verification module verifies the information of the database table structure of the business system, so as to ensure that the business system can use an accurate target database table structure after going online.
  • Compared with traditional technologies, the target database table structure has higher reliability, which improves the quality of the data provided after the business system goes online, reduces the probability of data conversion after the business system goes online, and reduces the cost of subsequent data governance.
  • the verification module 302 is configured to: when the information of the first data element does not match the information of the multiple standard data elements, send first modification prompt information, the first modification prompt information instructing to update the The information of the first data element, the first data element is one of the at least one data element; after receiving the updated first data element that matches the information of any one of the multiple standard data elements After the information of the data element, it is determined that the verification of the information of the first data element is successful.
  • the information of the first database table structure and the information of the standard data element both include a data element identifier
  • the first modification prompt information includes information of at least one first standard data element
  • the at least one first standard data element The data element identifier of each first standard data element in the data element is fuzzy matching with the data element identifier of the first data element.
  • the first modification prompt information includes information of a plurality of first standard data elements, and the information of the plurality of first standard data elements is sorted in descending order of the degree of matching between the data element identifier of each first standard data element and the data element identifier of the first data element; and/or sorted in descending order of the priority of the data standard to which each first standard data element belongs.
  • the data standard management system further includes: a first receiving module, configured to receive updated information about the structure of the first database table;
  • the first determining module is configured to: determine the incremental data elements in the updated information of the first database table structure, and determine the information of the first data element among the information of the incremental data elements; or determine the information of the first data element among the information of all data elements in the updated information of the first database table structure.
  • the verification module 302 is configured to: when the format of the information of the first database table structure does not meet a specified format requirement, send second modification prompt information, where the second modification prompt information instructs updating the format of the information of the first database table structure; and after receiving updated information of the first database table structure whose format meets the format requirement, determine that the format of the information of the first database table structure is verified successfully.
  • the data standard management system also includes:
  • a remark module, configured to: after the information of the first database table structure of the first business system to be launched is obtained, when the value corresponding to a second data element is an enumerable value, add data element remark information to the second data element, where the data element remark information identifies the enumerable values corresponding to the second data element, and the second data element is one of the at least one data element.
  • the information of the first database table structure and the information of the standard data elements both include a data element identifier and data element description information; the first obtaining module 301 is configured to: receive the data element identifier of the first data element, where the data element identifier of the first data element is one of the data element identifiers of the plurality of standard data elements stored in the data standard library; and obtain, from the data standard library, the data element description information corresponding to the data element identifier of the first data element.
  • the first obtaining module 301 is configured to: receive a data standard document, the data standard document including the information of the first database table structure.
  • the first obtaining module 301 is configured to receive the data standard document generated by a third-party modeling tool based on the data standard library.
  • the first obtaining module 301 is configured to: output a data dictionary template, where the data dictionary template is a reference template for the information of the first database table structure;
  • the data standard management system further includes: a second receiving module, configured to receive a data standard library operation request, where the data standard library operation request includes a standard data element addition request, a standard data element update request, a standard data element deletion request, or a standard data element query request; and an operation module, configured to perform the operation corresponding to the data standard library operation request on the data standard library after the data standard library operation request is authenticated successfully.
  • the data standard management system further includes: a third obtaining module, configured to obtain information of a second database table structure of a second business system that has gone online, where the information of the second database table structure includes information of at least one data element; a second determining module, configured to determine, when it is detected based on an artificial intelligence model that the information of a third data element matches none of the information of the plurality of standard data elements, a target standard data element corresponding to the third data element among the plurality of standard data elements, where the third data element is one of the at least one data element included in the information of the second database table structure; and an establishing module, configured to establish, based on the artificial intelligence model, a mapping relationship between the third data element and the target standard data element.
  • the data standard management system also includes:
  • the output module is configured to output the script corresponding to the target database table structure after obtaining the target database table structure of the first business system.
  • for the structure of the data standard management system provided by the embodiments of this application, reference may also be made to the data standard management system shown in FIG. 1 and FIG. 2. As shown in FIG. 7, the data standard management device 102 may include a business modeling module 1021, a standard review module 1022, and a service use module 1023; the life cycle management device 103 may include a standard formulation module 1031 and a data standard library operation module 1032; and the data governance device 104 includes a data standard module 1041 and a metadata module 1042.
  • the business modeling module 1021 can complete the functions of the aforementioned first acquisition module 301, verification module 302, and second acquisition module 303, that is, perform the actions of the aforementioned step 201 to step 202;
  • the standard review module 1022 can be used to perform the aforementioned second verification process;
  • the service use module 1023 is used to send a request to use the target database table structure to the first terminal device of the project manager after the target database table structure is determined; after the project manager finishes building the database table through the first terminal device, a notification that the database table can be used is sent to the second terminal device of the business personnel.
  • the standard formulation module 1031 is used to establish the information of the standard data elements in the data standard library.
  • the system administrator can define the information of the standard data elements offline and upload it to the data standard library; the data standard library operation module 1032 can perform the functions of the aforementioned second receiving module and operation module, that is, perform the actions of step 205 to step 206.
  • the data standard module 1041 can complete the functions of the aforementioned third acquisition module, second determination module, and establishment module, that is, execute the aforementioned step 207.
  • the metadata module 1042 can be used to set scheduled tasks that periodically collect information (such as metadata) about the database tables of the managed business systems, check the information of the database tables and data elements created or updated by the business systems, and perform data standardization verification; for the process, refer to the corresponding process in the foregoing step 202. This reduces the amount of information in the business systems whose format does not meet the requirements.
  • FIG. 8 schematically provides a possible basic hardware architecture of the computing device described in this application.
  • the computing device 400 includes a processor 401, a memory 402, a communication interface 403, and a bus 404.
  • the number of processors 401 may be one or more, and FIG. 8 only illustrates one of the processors 401.
  • the processor 401 may be a central processing unit (CPU). If the computing device 400 has multiple processors 401, the types of the multiple processors 401 may be different or may be the same. Optionally, multiple processors 401 of the computing device 400 may also be integrated into a multi-core processor.
  • the memory 402 stores computer instructions and data; the memory 402 can store the computer instructions and data required to implement the data processing method provided by this application. For example, the memory 402 stores instructions for implementing the steps of the data processing method.
  • the memory 402 may be any one or any combination of the following storage media: non-volatile memory (for example, read only memory (ROM), solid state drive (SSD), hard disk (HDD), optical disc), volatile memory.
  • the communication interface 403 may be any one or any combination of the following devices: a network interface (for example, an Ethernet interface), a wireless network card, and other devices with a network access function.
  • the communication interface 403 is used for data communication between the computing device 400 and other computing devices or terminals.
  • the bus 404 can connect the processor 401 with the memory 402 and the communication interface 403. In this way, through the bus 404, the processor 401 can access the memory 402, and can also use the communication interface 403 to interact with other computing devices or terminals.
  • the computing device 400 executes the computer instructions in the memory 402, so that the computing device 400 implements the data processing method provided in this application, or so that the computing device 400 deploys the data standard management system.
  • a non-transitory computer-readable storage medium including instructions, such as a memory including instructions, is also provided; the instructions can be executed by a processor of a server to complete the data processing method shown in each embodiment of this application.
  • the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
  • the program can be stored in a computer-readable storage medium.
  • the storage medium mentioned can be a read-only memory, a magnetic disk or an optical disk, etc.

Abstract

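A data processing method and a data standard management system, belonging to the field of data processing. The method includes: obtaining information of a first database table structure of a first business system to be launched, where the information of the first database table structure includes information of at least one data element; verifying the information of the first database table structure based on a data standard library, where the data standard library includes information of a plurality of standard data elements; and after the information of the first database table structure has been verified, obtaining a target database table structure of the first business system, where the target database table structure is determined based on the verified first database table structure. The method and system can improve the quality of the data provided after the business system goes online and reduce the probability of data conversion after the business system goes online. The method and system are used for data processing of business systems.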

Description

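Data processing method and data standard management system
Technical field
This application relates to the field of data processing, and in particular to a data processing method and a data standard management system.
Background
A data standard is the set of data meanings and business rules that must be jointly observed within a specified group (such as an industry or an organization). A data standard is usually expressed in the form of a table structure (also called a database table structure), which includes multiple business fields. Each business field in the table structure is called a data element and is the basic building block of the data standard. The information of a data element conforms to the business rules and includes its name, definition, structure, and value rules.
At present, some business systems maintain their own database table structures, and such a database table structure is the data processing condition on which the business system relies when providing data services. To manage data standards, a data standard management system has been proposed. The data standard management system stores information of the data standards that one or more industries must observe; these data standards may include national standards, industry standards, and/or local standards. After a business system goes online, the data standard management system can audit the information of the database table structure maintained in that business system. If the database table structure does not meet the requirements of the target data standard that the data standard management system stores for that business system, the data standard management system establishes a mapping relationship between the database table structure and the target data standard (that is, a mapping between the data elements of the database table structure and the data elements of the target data standard). Each time the business system provides a data service externally, the data standard management system uses the mapping relationship to convert data that satisfies the conditions defined by the database table structure into data that conforms to the target data standard, and then outputs the converted data.
However, a business system managed by the data standard management system must perform this data conversion every time it provides a data service externally, which reduces the efficiency of providing data services.
Summary
The embodiments of this application provide a data processing method and a data standard management system. The technical solutions are as follows: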
第一方面,提供一种数据处理方法,该方法包括:
获取待上线的第一业务系统的第一数据库表结构的信息,该第一数据库表结构的信息包括至少一个数据元的信息;基于数据标准库对该第一数据库表结构的信息进行校验,该数据标准库包括多个标准数据元的信息;在对该第一数据库表结构的信息校验后,获取该第一业务系统的目标数据库表结构,该目标数据库表结构基于校验后的该第一数据库表结构确定。
本申请实施例提供的业务数据处理方法,在业务系统上线前,对该业务系统的数据库表结构的信息进行校验,从而保证业务系统在上线后可以采用准确的目标数据库表结构。相较于传统技术,目标数据库表结构的可靠性较高,从而提高了业务系统上线后提供的数据的质量,减少了业务系统上线后数据转化的概率,降低了后期数据治理的成本。
数据标准管理系统获取待上线的第一业务系统的第一数据库表结构的信息的方式可以有多种,本申请实施例以以下两种方式为例进行说明:
在第一种可选方式中,通过在线获取的方式获取第一数据库表结构的信息。
该第一数据库表结构的信息和该标准数据元的信息均包括数据元标识和数据元描述信息,该获取待上线的第一业务系统的第一数据库表结构的信息,包括:
接收该第一数据元的数据元标识,该第一数据元的数据元标识为该数据标准库存储的多个标准数据元的数据元标识中的一个;在该数据标准库中获取该第一数据元的数据元标识对应的数据元描述信息。
在一种可能实现中,数据标准管理系统支持关键字搜索功能。该第一数据元的数据元标识是通过该第一数据元对应的关键字在该数据标准库存储的多个标准数据元的数据元标识中搜索得到的。
在一种可能实现中,数据标准管理系统还支持搜索提示功能,以保证业务人员有效地确定第一数据元的数据元标识。
在一种可能实现中,在该接收第一数据元的数据元标识之前,该方法还包括:将接收的该关键字与该数据标准库存储的多个标准数据元的数据元标识进行匹配;输出匹配结果,该匹配结果包括该至少一个第二标准数据元的信息,该至少一个第二标准数据元中每个第二标准数据元的数据元标识均与该关键字匹配。
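In a possible implementation, the algorithm for matching the received keyword against the data element identifiers of the plurality of standard data elements may be a fuzzy matching algorithm, that is, an algorithm that matches with a certain degree of precision according to the given conditions or requirements. The principle of fuzzy matching is to search first for content identical to the searched content and, if none is found, to search for content that is very close to it. In the embodiments of this application, the fuzzy matching algorithm also allows part of the keyword's characters to appear in a different order or with gaps. The searched content may include synonyms, near-synonyms, and related words of the keyword, as well as phrases containing the keyword.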
采用模糊匹配算法的得到的匹配结果既可以包括精确匹配的结果,又可以包括除精确匹配之外的结果,相较于单纯采用精确匹配算法来获取第二标准数据元的信息,模糊匹配所匹配的内容更加广泛,获取的第二标准数据元的信息的个数多,从而提高匹配结果对业务人员可参考性。其中,精确匹配算法指的是匹配条件是在搜索的关键字与标准数据元的数据元标识二者字面完全一致时才确定匹配,匹配限制精确严格。
可选的,若匹配得到的第二标准数据元有多个时,在匹配结果中,多个第二标准数据元的信息可以按照第一指定顺序排列。例如,该第一指定顺序可以由以下两种示意性实现方式实现:
在第一种示意性实现方式中,多个第二标准数据元的信息按照第二标准数据元的数据元标识与关键字的匹配度降序排序(即按照匹配度从大到小的顺序排序)。
示例的,对于任一第二标准数据元,该第二标准数据元的数据元标识与关键字的匹配度的计算方式可以有多种,例如该匹配度P1满足第一匹配度计算公式:
P1=M/N;
其中,M为该第二标准数据元的数据元标识与关键字相同的字符数,相当于第二标准数据元的数据元标识与关键字交集所对应的字符数;N为该第二标准数据元的数据元标识的字符数与关键字的字符数中的最大字符数。
在第二种示意性实现方式中,多个第二标准数据元的信息按照第二标准数据元所属的数据标准的优先级降序排序(即按照优先级从高到低的顺序排序)。
可选的,数据标准的优先级可以包括标准优先级或时间优先级。该标准优先级指的是标准自身的优先级。标准优先级从高到低的顺序排序依次是国标、行标和地标。时间优先级通常是发布时间或实施时间距离当前越近,优先级越高。
值得说明的是,第一指定顺序还可以有其他方式,例如该第一指定顺序是将前述第一种和第二种示意性实现方式进行结合确定的顺序。也即是,多个第二标准数据元的信息按照第二标准数据元的数据元标识与关键字的匹配度以及第二标准数据元所属的数据标准的优先级排序。例如,对于每个第二标准数据元,数据标准管理系统可以获取该第二标准数据元的数据元标识与关键字的匹配度,以及获取该第二标准数据元所属的数据标准的优先级,并按照指定规则为该第二标准数据元所属的数据标准的优先级赋值,其中,优先级与所赋的数值正相关,也即是优先级越高,数值越高;接着,按照预先为匹配度和优先级分别分配的权值,基于该第二标准数据元对应的匹配度和优先级,通过加权求和的方式确定该第二标准数据元的排序指示值。最终,数据标准管理系统按照各个第二标准数据元对应的排序指示值进行各个第二标准数据元的信息的排序。通常按照排序指示值降序排序。
通过按照第一指定顺序对多个第二标准数据元的信息排序,可以提高对业务人员有效的提示,提高提示命中率。
在第二种可选方式中,通过接收线下已编辑的数据标准文档的方式获取第一数据库表结构的信息。
在一种可能实现中,该获取待上线的第一业务系统的第一数据库表结构的信息,包括:
接收数据标准文档,该数据标准文档包括该第一数据库表结构的信息。
其中,该第三方建模工具可以访问(如查询)数据标准库,获取数据标准库中存储标准数据元的信息,并基于标准数据元的信息来进行数据标准文档的生成。相应的,该接收数据标准文档的过程,包括:接收第三方建模工具基于该数据标准库生成的该数据标准文档。
由于第三方建模工具可以支持满足数据标准库要求的数据标准文档的生成,相应的第一数据库表结构的信息全部或部分符合数据标准库的要求,因此可以减少校验的运算代价,降低校验成本。
在一种可能实现中,该获取待上线的第一业务系统的第一数据库表结构的信息,包括:
输出数据字典模板,该数据字典模板为该第一数据库表结构的信息的参考模板;
接收基于该数据字典模板输入的该第一数据库表结构的信息。
数据标准管理系统通过输出数据字典模板,以供业务人员进行参考,使得业务人员不再单纯靠自身经验来制定数据元的信息,而是有所依据地制定数据元的信息,从而可以提高获取的第一数据库表结构的信息的准确性,减少后续校验过程的复杂度和运算代价。
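In the embodiments of this application, verifying the information of the first database table structure may include at least the following two optional modes: data standard symbol verification and data standard normative verification.
First optional mode: data standard symbol verification. This verification checks the information of the at least one data element against the information of the standard data elements in the data standard library. The information of the standard data elements is the "symbol" referred to above.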
在一种可能实现中,该基于数据标准库对该第一数据库表结构的信息进行校验的过程,包括:
当第一数据元的信息与该多个标准数据元的信息均不匹配时,发送第一修改提示信息,该第一修改提示信息指示更新该第一数据元的信息,该第一数据元为该至少一个数据元中的一个数据元;在接收到与该多个标准数据元中任一标准数据元的信息匹配的更新后的该第一数据元的信息后,确定对该第一数据元的信息校验成功。
业务人员可以通过数据标准管理系统多次发送的第一修改提示信息,实现第一数据元的信息的多次修改,以达到数据标准库中标准数据元的要求,使得业务人员可以定义出与数据标准库的标准数据元的信息一致的数据元的信息。
在一种可能实现中,该第一数据库表结构的信息和该标准数据元的信息均包括数据元标识,该第一修改提示信息包括至少一个第一标准数据元的信息,该至少一个第一标准数据元中每个第一标准数据元的数据元标识均与该第一数据元的数据元标识模糊匹配。例如该模糊匹配算法为ElasticSearch中的搜索算法。
采用模糊匹配算法的得到的匹配结果既可以包括精确匹配的结果,又可以包括除精确匹配之外的结果,相较于单纯采用精确匹配算法来获取第一标准数据元的信息,模糊匹配所匹配的内容更加广泛,获取的第一标准数据元的信息的个数多,从而提高匹配结果对业务人员可参考性。
在一种可能实现中,若匹配得到的第一标准数据元有多个时,则在第一修改提示信息中,多个第一标准数据元的信息可以按照第二指定顺序排列。例如,该第二指定顺序可以由以下两种示意性实现方式实现:
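In a first illustrative implementation, the information of the plurality of first standard data elements is sorted in descending order of the degree of matching between the data element identifier of each first standard data element and the data element identifier of the first data element (that is, from the highest matching degree to the lowest).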
在第二种示意性实现方式中,多个第一标准数据元的信息按照第一标准数据元所属的数据标准的优先级降序排序(即按照优先级从高到低的顺序排序)。
值得说明的是,第二指定顺序还可以有其他方式,例如该第二指定顺序是将前述第一种和第二种示意性实现方式进行结合确定的顺序。也即是,多个第一标准数据元的信息按照第一标准数据元的数据元标识与第一数据元的数据元标识的匹配度以及第一标准数据元所属的数据标准的优先级排序。例如,对于每个第一标准数据元,数据标准管理系统可以获取该第一标准数据元的数据元标识与第一数据元的数据元标识的匹配度,以及获取该第一标准数据元所属的数据标准的优先级,并按照指定规则为该第一标准数据元所属的数据标准的优先级赋值,其中,优先级与所赋的数值正相关;接着,按照预先为匹配度和优先级分别分配的权值,基于该第一标准数据元对应的匹配度和优先级,通过加权求和的方式确定该第一标准数据元的排序指示值。最终,数据标准管理系统按照各个第一标准数据元对应的排序指示值进行各个第一标准数据元的信息的排序。通常按照排序指示值降序排序。
通过按照第二指定顺序对多个第一标准数据元的信息排序,可以提高对业务人员有效的提示,提高提示命中率。
在一种可能实现中,该方法还包括:接收更新后的该第一数据库表结构的信息;
在该更新后的该第一数据库表结构的信息中确定增量的数据元,并在该增量的数据元的信息中确定该第一数据元的信息;或者,在该更新后的该第一数据库表结构的信息的全量的数据元的信息中确定该第一数据元的信息。
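Second optional mode: data standard normative verification. This verification checks whether the information of the data standard is well-formed, and mainly checks the format of that information.
Verifying the information of the first database table structure based on the data standard library includes:
When the format of the information of the first database table structure does not meet a specified format requirement, sending second modification prompt information, where the second modification prompt information instructs updating the format of the information of the first database table structure; and after receiving updated information of the first database table structure whose format meets the format requirement, determining that the format of the information of the first database table structure is verified successfully.
Through the second modification prompt information, which the data standard management system may send multiple times, business personnel can revise the format of the information of the first database table structure repeatedly until it meets the data standard management system's format requirements for standard information, so that business personnel can define information of a database table structure that meets the requirements.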
在一种可能实现中,在该获取待上线的第一业务系统的第一数据库表结构的信息之后,该方法还包括:
当第二数据元对应的数值为可枚举数值时,为该第二数据元添加数据元备注信息,该数据元备注信息用于标识该第二数据元对应的可枚举数值,该第二数据元为该至少一个数据元中的一个数据元。
如此,在后续第一数据库表结构的信息校验完成后,在采用基于该第一数据库表结构确定的目标数据库表结构中可以仍然包括为第二数据元添加数据元备注信息。在第一业务系统上线后,若需要采集该第二数据元对应的数据,可以直接按照第二数据元对应的可枚举数值的格式采集数据,以保证最终采集的数据符合数据标准库的格式要求,也即是符合相关标准。
在一种可能实现中,该方法还包括:
接收数据标准库操作请求,该数据标准库操作请求包括标准数据元添加请求、标准数据元更新请求、标准数据元删除请求或标准数据元查询请求;在对该数据标准库操作请求鉴权成功后,对该数据标准库执行该数据标准库操作请求所对应的操作。
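The embodiments of this application illustratively propose the following authentication modes:
First authentication mode: authentication of data operations with a high confidentiality level.
When the data standard library operation request is a standard data element addition request, a standard data element update request, or a standard data element deletion request, the data standard management system checks whether the account carried in the request is a first-level account, where the first level is higher than a specified level threshold. For example, the first-level account is the account of the system administrator.
When the data standard management system detects that the account carried in the request is not a first-level account, it determines that authentication of the data standard library operation request fails.
When the data standard management system detects that the account carried in the request is a first-level account, in one optional mode it determines that authentication of the data standard library operation request succeeds. In another optional mode, it sends the data standard library operation request to the terminal device corresponding to a second-level account; after receiving an allow instruction indicating that the operation on the data standard library is permitted, it determines that authentication succeeds, and after receiving a prohibit instruction indicating that the operation is not permitted, it determines that authentication fails. The second level is higher than or equal to the first level, and the second-level account is different from the first-level account. For example, if the second-level account is the account of the project manager, the terminal device corresponding to the second-level account is the aforementioned first terminal device. After receiving the data standard library operation request through the first terminal device, the project manager decides, based on the content of the request and the account it carries, whether to allow the corresponding person to operate on the data standard library; if so, the project manager sends the allow instruction through the first terminal device, and otherwise sends the prohibit instruction through the first terminal device.
Second authentication mode: authentication of data operations with a low confidentiality level.
When the data standard library operation request is a standard data element query request, the data standard management system checks whether the account carried in the request is an account allocated by the data standard management system, that is, a legitimate account in the data standard management system. For example, such an account is any one of the accounts of the project manager, the business personnel, and the system administrator.
When the data standard management system detects that the account carried in the request is not an account allocated by the data standard management system, it determines that authentication of the data standard library operation request fails.
When the data standard management system detects that the account carried in the request is an account allocated by the data standard management system, in one optional mode it determines that authentication of the data standard library operation request succeeds. In another optional mode, it sends the data standard library operation request to the terminal device corresponding to a third-level account; after receiving an allow instruction indicating that the operation on the data standard library is permitted, it determines that authentication succeeds, and after receiving a prohibit instruction indicating that the operation is not permitted, it determines that authentication fails.
The third level is higher than or equal to the level of the account carried in the request, and the third-level account is different from the account carried in the request. For example, the third-level account is the account of the project manager or the system administrator. Assuming the third-level account is the system administrator's account, the terminal device corresponding to the third-level account is the aforementioned third terminal device. After receiving the data standard library operation request through the third terminal device, the system administrator decides, based on the content of the request and the account it carries, whether to allow the corresponding person to operate on the data standard library; if so, the system administrator sends the allow instruction through the third terminal device, and otherwise sends the prohibit instruction through the third terminal device.
In both authentication modes, if the data standard management system authenticates the data standard library operation request successfully, it performs the operation corresponding to the request on the data standard library. In a possible implementation, the data standard management system may also send a data operation response indicating that the operation corresponding to the request has been performed on the data standard library, or that the operation succeeded.
If the data standard management system fails to authenticate the data standard library operation request, it prohibits the operation corresponding to the request from being performed on the data standard library. In a possible implementation, the data standard management system may also send a data operation response indicating that the operation corresponding to the request is prohibited, or that the operation failed.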
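The two authentication modes can be sketched roughly as follows. This is a minimal illustration, not the system's actual implementation: the account names, the numeric levels, the threshold value, and the `approver` callback (standing in for the allow or prohibit instruction returned from a higher-level account's terminal device) are all assumptions introduced here.

```python
from typing import Callable, Optional

# Hypothetical account levels; the source only states that a first-level
# account must be higher than a specified level threshold.
ACCOUNT_LEVELS = {"business_personnel": 1, "system_admin": 3, "project_manager": 3}
LEVEL_THRESHOLD = 2
WRITE_OPERATIONS = {"add", "update", "delete"}

# Stand-in for the allow/prohibit instruction sent back by a reviewer's terminal.
Approver = Callable[[str, str], bool]


def authenticate(operation: str, account: str,
                 approver: Optional[Approver] = None) -> bool:
    """Return True when the library operation request passes authentication."""
    level = ACCOUNT_LEVELS.get(account)
    if level is None:
        # Not an account allocated by the system: fails in both modes.
        return False
    if operation in WRITE_OPERATIONS:
        # High-confidentiality mode: only first-level accounts may proceed.
        if level <= LEVEL_THRESHOLD:
            return False
    elif operation != "query":
        # Unknown operation type.
        return False
    if approver is not None:
        # Optional second step: a different, higher-level account decides.
        return approver(operation, account)
    return True
```

Passing an `approver` models the variant in which the request is forwarded for review; omitting it models the variant in which the account check alone decides.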
在一种可能实现中,该方法还包括:查询该数据标准库对应的操作日志;当该操作日志中包括异常操作日志,发出异常告警。系统管理员可以基于异常告警确定是否需要进行数据操作的回溯,实现数据的回滚,从而有效维护数据标准管理系统。
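In the embodiments of this application, the data standard management system also supports a data governance function. In an optional example, an artificial intelligence model is established in the data standard management system in advance, and data governance is performed through this model. The data governance process may include the following steps: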
获取上线后的第二业务系统的第二数据库表结构的信息,该第二数据库表结构的信息包括至少一个数据元的信息;当基于人工智能模型检测到第三数据元的信息与该多个标准数据元的信息均不匹配时,在该多个标准数据元中确定与该第三数据元对应的目标标准数据元,该第三数据元为该第二数据库表结构的信息包括的至少一个数据元的一个数据元;基于该人工智能模型,建立该第三数据元与该目标标准数据元的映射关系。
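In a possible implementation, the data standard management system supports a script output function. After the target database table structure of the first business system is obtained, the method further includes: outputting a script corresponding to the target database table structure. The script is used to generate the information of the target database table structure and includes the database table creation statements of the first business system. Business personnel can receive the script through the second terminal device. After the first business system goes online, business personnel can load and run the script in the first business system; the running script generates the information of the target database table structure, and the corresponding database table structure is built according to that information. Business personnel therefore do not need to write the script themselves, which reduces their workload and saves labor costs.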
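As a rough illustration of the script output function, the sketch below turns a verified table structure into a table creation statement. The dictionary layout (`name`, `type`, `comment`), the function name, and the MySQL-style `COMMENT` clause are assumptions made for this example; the source does not specify the script format or the SQL dialect.

```python
def build_create_table(table_name: str, elements: list) -> str:
    """Render a CREATE TABLE statement from a list of data element dicts.

    Each element is a dict with 'name' and 'type', plus an optional
    'comment' (for example, the remark describing enumerable values).
    Identifiers are used as given; a real system would validate them.
    """
    columns = []
    for element in elements:
        column = f"    {element['name']} {element['type']}"
        if element.get("comment"):
            # Escape single quotes so the comment stays a valid SQL literal.
            comment = element["comment"].replace("'", "''")
            column += f" COMMENT '{comment}'"
        columns.append(column)
    return f"CREATE TABLE {table_name} (\n" + ",\n".join(columns) + "\n);"


script = build_create_table(
    "person",
    [
        {"name": "gender", "type": "CHAR(1)", "comment": "0: male; 1: female"},
        {"name": "age", "type": "INT"},
    ],
)
```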
第二方面,提供一种数据标准管理系统,该数据标准管理系统可以包括至少一个模块,该至少一个模块可以用于实现上述第一方面或者第一方面的各种可能实现提供的该数据处理方法。
第三方面,本申请提供一种计算设备,该计算设备包括处理器和存储器。该存储器存储计算机指令;该处理器执行该存储器存储的计算机指令,使得该计算设备执行上述第一方面或者第一方面的各种可能实现提供的方法,使得该计算设备部署上述第二方面或者第二方面的各种可能实现提供的该数据标准管理系统。
第四方面,本申请提供一种计算机可读存储介质,该计算机可读存储介质中存储有计算机指令,该计算机指令指示该计算设备执行上述第一方面或者第一方面的各种可能实现提供的方法,或者该计算机指令指示该计算设备部署上述第二方面或者第二方面的各种可能实现提供的数据标准管理系统。
第五方面,本申请提供一种计算机程序产品,该计算机程序产品包括计算机指令,该计算机指令存储在计算机可读存储介质中。计算设备的处理器可以从计算机可读存储介质读取该计算机指令,处理器执行该计算机指令,使得该计算设备执行上述第一方面或者第一方面的各种可能实现提供的方法,使得该计算设备部署上述第二方面或者第二方面的各种可能实现提供的数据标准管理系统。
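In a sixth aspect, a chip is provided. The chip may include programmable logic circuits and/or program instructions, and when the chip runs it is used to implement the data processing method according to any implementation of the first aspect.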
附图说明
图1是本申请实施例提供的一种数据处理方法所涉及的数据标准管理系统的应用环境示意图;
图2是本申请实施例提供的另一种数据处理方法所涉及的数据标准管理系统的应用环境示意图;
图3是本申请一示意性实施例提供的一种数据处理方法的流程示意图;
图4是本申请实施例中数据标准管理系统提供的一种示意性的界面示意图;
图5是本申请实施例中数据标准管理系统提供的另一种示意性的界面示意图;
图6是本申请实施例提供的一种数据标准管理系统的结构示意图;
图7是本申请实施例提供的另一种数据标准管理系统的结构示意图;
图8是本申请实施例提供的一种计算机设备的结构示意图。
具体实施方式
为使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。
为了便于读者理解,本申请实施例首先对本申请实施例提供的一种数据处理方法所涉及的专有名词进行解释。
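A data standard is the set of data meanings and business rules that must be jointly observed within a specified group (such as an industry or an organization). A data standard is usually expressed in the form of a table structure, which includes multiple business fields. Because the table structure corresponding to a data standard is usually stored in a database, it is also called a database table structure, and the table it defines is a database table (also called a physical table or data entity).
A data element is one business field in a database table structure and is the basic building block of the database table structure. The information of a data element includes information describing the attributes of the data element, that is, its attribute information. The attribute information may include information about parameters that describe the data element's own attributes, such as its name, definition, structure, and value rules, and may also include information about parameters that describe the environment the data element belongs to, such as the name of the database table structure it belongs to. A data element in a data standard is called a standard data element.
A code set is the definition of the enumerable values in a database table structure. An enumerable value is a value that has multiple instances. For example, saying that the value corresponding to a data element is an enumerable value means that the data element can take multiple values, but the number of values is finite. Because the name of a data element, the values in the data element content, and the descriptions of those values can all be values, for a given data element the code set is the definition of its enumerable values when the value corresponding to that data element is enumerable. For example, if the name of a data element is "gender" and its content has two enumerable values, male and female, the code set defines that these two enumerable values are represented by 0 and 1 respectively.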
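To make the relationship between a data element and its code set concrete, the following sketch models the gender example. The class name, field names, and the `char(1)` type are hypothetical; only the mapping of male and female to 0 and 1 comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """One business field of a database table structure (illustrative layout)."""
    name: str
    data_type: str
    # Code set: maps each stored code to the enumerable value it represents.
    code_set: dict = field(default_factory=dict)

    def is_enumerable(self) -> bool:
        """A data element is enumerable when it defines a code set."""
        return bool(self.code_set)

    def decode(self, code: str) -> str:
        """Translate a stored code into the enumerable value it stands for."""
        if code not in self.code_set:
            raise ValueError(f"{code!r} is not in the code set of {self.name!r}")
        return self.code_set[code]


gender = DataElement(name="gender", data_type="char(1)",
                     code_set={"0": "male", "1": "female"})
```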
以下对本申请实施例提供的一种数据处理方法所涉及应用场景进行说明。
在某一行业或某一组织内,执行某一项目涉及一个或多个业务系统。该一个或多个业务系统需要遵循相同的数据标准,如国标、行标或地标。如此可以提高该项目的作业效率以及数据质量,减少作业成本。
示例的,假设属于一组织X的企业建立一个项目,由多个厂家来实现该项目,每个厂家维护有一个或多个业务系统(通常维护一个业务系统),则采用本申请实施例提供的数据处理方法所涉及的数据标准管理系统,可以实现该多个厂家所维护的业务系统采用的数据标准的一致性(即各个厂家采用的数据库表结构均符合数据标准的要求)。其中,本申请实施例将企业的管理员称为项目管理员,将厂家的管理人员称为业务人员,维护数据标准管理系统的工作人员称为系统管理员。
可选的,前述组织X可以为通信组织、电力组织、水利组织或农业组织等等,相应的,业务系统可以为通信业务系统、电力业务系统、水利业务系统或农业业务系统等等。
图1是本申请实施例提供的一种数据处理方法所涉及的数据标准管理系统的应用环境示意图。请参考图1,该数据标准管理系统10包括:
数据标准库101和数据标准管理设备102。数据标准库101与数据标准管理设备102建立有有线或无线的通信连接。数据标准库101可以为一个服务器或者由多个服务器组成的服务器集群。数据标准库101用于存储多个标准数据元的信息。标准数据元指的是已建立的数据标准(如已建立的国标、行标或地标)中的数据元。数据标准管理设备102可以为一个服务器或者由多个服务器组成的服务器集群或者其他计算机设备。数据标准管理设备102用于管理一个或多个业务系统的部分或全部功能,并通过数据标准库101中存储的标准数据元对所管理的业务系统的数据库表结构的信息进行校验。
可选的,如图2所示,该数据标准管理系统10还可以包括:
生命周期管理设备103以及数据治理设备104。生命周期管理设备103以及数据治理设备104分别与数据标准库101建立有有线或无线的通信连接。生命周期管理设备103可以为一个服务器或者由多个服务器组成的服务器集群。生命周期管理设备103用于管理对数据标准库101的操作,如标准数据元的添加操作、更新操作、删除操作或查询操作。数据治理设备104可以为一个服务器或者由多个服务器组成的服务器集群或者云计算中心。数据治理设备104用于进行数据标准库的数据治理,如建立一个数据库表结构中的数据元与数据标准库101中的标准数据元的映射关系。
值得说明的是,当前述数据标准管理系统所需处理的数据量较小或者存在一个数据处理性能较高的设备时,前述数据标准库101、数据标准管理设备102、生命周期管理设备103以及数据治理设备104的多种功能可以由一个设备的对应模块实现,也即是该数据标准库101、数据标准管理设备102、生命周期管理设备103以及数据治理设备104中的至少两个设备的功能集成在一个设备上。例如,该设备可以包括数据标准存储模块、数据标准管理模块、生命周期管理模块以及数据治理模块,分别对应实现前述数据标准库101、数据标准管理设备102、生命周期管理设备103以及数据治理设备104的功能。
可选的,该数据标准管理系统还支持任务管理功能,用于管理各个业务系统,并将管理权限分配至各个业务系统的业务人员。各个业务人员只能管理与分配给自己的业务系统相关的数据库等。
进一步可选的,该数据标准管理系统10还可以包括分别与项目管理员、业务人员以及系统管理员对应的第一终端设备、第二终端设备和第三终端设备。项目管理员、业务人员以及系统管理员可以通过对应的终端设备访问该数据标准管理系统。
图3是本申请一示意性实施例提供的一种数据处理方法的流程示意图,该数据处理方法可以应用于图1或图2所示的数据标准管理系统中。后续实施例假设待上线的第一业务系统需要基于数据标准库进行第一数据库表结构的校验,数据标准库包括多个标准数据元的信息,则如图3所示,该方法包括:
步骤201、数据标准管理系统获取待上线的第一业务系统的第一数据库表结构的信息。
业务人员在新建或重建第一业务系统时,可以为第一业务系统建立数据库表结构。该第一业务系统的数据库表结构可以有一个或多个,该数据库表结构是业务人员自定义 的,业务人员可以将其维护的数据库表结构提供给数据标准管理系统,以由数据标准管理系统进行数据库表结构的信息的校验。例如,业务人员可以通过其使用的第二终端设备通过一次编辑请求或者多次编辑请求将该数据库表结构的信息输入至数据标准管理系统。
假设第一数据库表结构为第一业务系统中的一个数据库表结构,其他的数据库表结构的处理方式参考该第一数据库表结构。在本申请实施例中,第一数据库表结构的信息包括至少一个数据元的信息。第一数据库表结构的信息中的每个数据元的信息包括数据元的属性信息;在一些场景中,当数据元对应的数值为可枚举数值时,数据元的信息还包括代码集。本申请实施例中,数据元的信息还可以划分为数据元标识和数据元描述信息,数据元标识用于标识对应的数据元,数据元标识可以包括数据元名称和/或数据元编码(也称数据元编号),数据元名称可以包括中文名称和/或英文名称,数据元编码可以是用于标识对应数据元的数字和/或字母组成的字符串。数据元描述信息用于描述该数据元,其包括属性信息中除数据元标识的部分。在一些可选情况中,该数据元描述信息还包括代码集,如在数据元对应的数值为可枚举数值时,其还包括代码集。
可选的,该第一数据库表结构的信息还包括第一数据库的信息以及第一表信息。其中,由于业务系统建立的数据库表结构需要存储到指定空间中,以便于维护。因此,通常业务人员在数据库表结构前,需要先建立一个数据库,以用于存储第一数据库表结构包括的数据元的信息。则前述第一数据库是用于存储第一数据库表结构的数据库。第一数据库的信息可以包括该数据库标识、地址和/或结构等属性信息。数据库标识用于标识对应的数据库,数据库标识可以包括数据库名称和/或数据库编码。数据库名称可以包括中文名称和/或英文名称,数据库编码可以是用于标识对应数据库的数字和/或字母组成的字符串。第一表信息是第一数据库表中除数据元的信息之外的信息,该第一表信息可以包括该数据库表标识和/或结构等属性信息。数据库表标识用于标识对应的数据库表,数据库表标识可以包括数据库表名称和/或数据库表编码,数据库表名称可以包括中文名称和/或英文名称,数据库表编码可以是用于标识对应数据库表的数字和/或字母组成的字符串。
数据标准管理系统获取待上线的第一业务系统的第一数据库表结构的信息的方式可以有多种,本申请实施例以以下两种方式为例进行说明:
在第一种可选方式中,通过在线获取的方式获取第一数据库表结构的信息。
第一数据库表结构的信息可以携带在一个可在线编辑的数据标准文档中。第一数据库表结构的信息主要包括至少一个数据元的信息,则获取待上线的第一业务系统的第一数据库表结构的信息的过程主要包括获取该至少一个数据元的信息的过程。业务人员可以通过其使用的第二终端设备通过在线编辑的方式逐个向数据标准管理系统输入多个数据元的信息,相应的,数据标准管理系统通过在线获取的方式逐个接收多个数据元的信息。第一数据库表结构的信息中的其他信息,如第一数据库的信息以及第一表信息也可以通过在线编辑的方式向数据标准管理系统输入,数据标准管理系统接收该第一数据库表结构的信息中的其他信息即可。
本申请实施例假设第一数据元的数据元标识为数据标准库存储的多个标准数据元的数据元标识中的一个,则第一数据库表结构的信息和数据标准库所存储的标准数据元的 信息均包括数据元标识和数据元描述信息。数据标准管理系统可以通过第一数据元的数据元标识获取第一数据元的信息。其他数据元的信息的获取过程参考该第一数据元的信息的获取过程。
业务人员可以通过其使用的第二终端设备向数据标准管理系统输入第一数据元的数据元标识。相应的,数据标准管理系统接收第一数据元的数据元标识;由于第一数据元属于数据标准库存储的多个标准数据元,因此,该数据标准库中存储有该第一数据元的信息。则数据标准管理系统可以基于该第一数据元的数据元标识在数据标准库中查询,从而在数据标准库中获取第一数据元的数据元标识对应的数据元描述信息,从而基于第一数据元的数据元标识以及数据元描述信息得到第一数据元的信息。
在一种可选示例中,业务人员可以根据自身经验,自行输入第一数据元的数据元标识。例如,数据标准管理系统可以在用户界面提供输入框,业务人员在该输入框中输入第一数据元的数据元标识。
在另一种可选示例中,数据标准管理系统支持关键字搜索功能。第一数据元的数据元标识可以通过第一数据元对应的关键字在数据标准库存储的多个标准数据元的数据元标识中搜索得到。例如,数据标准管理系统可以在用户界面提供搜索框,业务人员可以在该搜索框输入第一数据元对应的关键字,由数据标准管理系统基于该关键字在数据标准库存储的多个标准数据元的数据元标识中搜索得到第一数据元的数据元标识。其中,关键字是用于索引第一数据元的数据元标识的字符,其可以由一个或多个字符组成,该关键字可以包括:英文字符、中文字符和/或数字字符(也称数值字符)。
可选的,数据标准管理系统还支持搜索提示功能,以保证业务人员有效地确定第一数据元的数据元标识。例如,在接收第一数据元的数据元标识之前,数据标准管理系统可以将接收的关键字与数据标准库存储的多个标准数据元的数据元标识进行匹配;并输出匹配结果,匹配结果包括至少一个第二标准数据元的信息,至少一个第二标准数据元中每个第二标准数据元的数据元标识均与关键字匹配。如此,业务人员可以在匹配结果中获取与关键字相关的各个第二标准数据元的信息,从而在至少一个第二标准数据元中选择自己想要的第一数据元。相应的,数据标准管理系统在检测到选择操作后,接收对应的第一数据元的数据元标识。
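It is worth noting that the matching result usually includes the information of a specified number of second standard data elements, where the specified number is an integer greater than 1; this makes the matching result more useful as a reference for business personnel. The algorithm by which the data standard management system matches the received keyword against the data element identifiers of the plurality of standard data elements may be a fuzzy matching algorithm, for example the search algorithm in ElasticSearch (ES for short). A fuzzy matching algorithm matches with a certain degree of precision according to the given conditions or requirements. The principle of fuzzy matching is to search first for content identical to the searched content and, if none is found, to search for content that is very close to it. In the embodiments of this application, the fuzzy matching algorithm also allows part of the keyword's characters to appear in a different order or with gaps. The searched content may include synonyms, near-synonyms, and related words of the keyword, as well as phrases containing the keyword.
The matching result obtained with a fuzzy matching algorithm can include both exact matches and results other than exact matches. Compared with obtaining the information of second standard data elements with an exact matching algorithm alone, fuzzy matching covers a wider range of content and returns the information of more second standard data elements, which makes the matching result more useful as a reference for business personnel. An exact matching algorithm determines a match only when the search keyword and the data element identifier of a standard data element are literally identical, so its matching constraint is strict. The matching algorithm provided in the embodiments of this application may also be another algorithm, which is not limited here.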
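The "exact first, then close" principle can be sketched with Python's standard `difflib` module. This is a stand-in chosen for a self-contained example, not the ElasticSearch search algorithm the text mentions; the function name, the `limit` and `cutoff` defaults, and the sample identifiers are assumptions.

```python
import difflib


def fuzzy_search(keyword: str, identifiers: list,
                 limit: int = 5, cutoff: float = 0.3) -> list:
    """Return identifiers matching the keyword, exact matches taking priority."""
    exact = [i for i in identifiers if i == keyword]
    if exact:
        return exact[:limit]
    # No exact hit: fall back to identifiers that contain the keyword,
    # followed by identifiers that difflib rates as similar enough.
    containing = [i for i in identifiers if keyword in i]
    close = difflib.get_close_matches(keyword, identifiers, n=limit, cutoff=cutoff)
    merged = list(dict.fromkeys(containing + close))  # de-duplicate, keep order
    return merged[:limit]


library = ["整型测试字段", "整型测试字段1", "性别"]
hits = fuzzy_search("整型", library)
```

Synonym and related-word expansion, which the text also mentions, is outside what `difflib` provides and is not attempted here.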
值得说明的是,数据标准管理系统除了支持关键字搜索功能,还支持条件搜索功能。相应的搜索提示功能不仅包括关键字搜索功能对应结果,还包括条件搜索功能对应结果。
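The data standard management system can provide a condition input box in the user interface. Business personnel can enter a search condition in this box, and the data standard management system searches the data element identifiers of the plurality of standard data elements stored in the data standard library for identifiers that satisfy the condition. Furthermore, business personnel can perform a condition search together with a keyword search; in that case the data standard management system outputs the information of the second standard data elements that satisfy the search condition and whose data element identifiers match the keyword. This gives business personnel more accurate recommendations of second standard data element information, shortens the time they spend looking through the information of multiple second standard data elements, and helps them quickly select the standard data element information they want.
The search condition may include the release time, information about the competent authority, and/or the standard category, among others. The release time is the release time of the data standard to which a standard data element in the data standard library belongs, for example 2018. The competent authority information is information about the manager of that data standard, for example a certain electrical industry association. The standard category is the category of that data standard, for example a safety category or a product category.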
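Combining a keyword search with search conditions amounts to filtering the keyword hits by the metadata of the data standard each element belongs to. The sketch below assumes a simple substring keyword match and invents the record layout, field names, and sample data purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StandardElement:
    """A standard data element plus metadata of its data standard (hypothetical)."""
    identifier: str
    release_year: int
    authority: str
    category: str


def search(elements: list, keyword: str,
           release_year: Optional[int] = None,
           authority: Optional[str] = None,
           category: Optional[str] = None) -> list:
    """Keep elements whose identifier contains the keyword and that satisfy
    every search condition that was supplied (None means 'not specified')."""
    results = []
    for element in elements:
        if keyword not in element.identifier:
            continue
        if release_year is not None and element.release_year != release_year:
            continue
        if authority is not None and element.authority != authority:
            continue
        if category is not None and element.category != category:
            continue
        results.append(element)
    return results


catalog = [
    StandardElement("整型测试字段", 2018, "association A", "product"),
    StandardElement("整型测试字段1", 2020, "association A", "safety"),
]
```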
图4是本申请实施例中数据标准管理系统提供的一种示意性的界面示意图。假设业务人员在搜索框输入的关键字为:“整型”,在条件输入框(图4中该条件输入框用于输入发布时间)未输入内容,数据标准管理系统输出的匹配结果包括数据元标识(图4以数据元标识为数据元中文名称为例进行说明)为“整型测试字段”、“整型测试字段1”的2个第二标准数据元的信息,在检索到该2个第二标准数据元的信息的过程中没有进行发布时间的限制,业务人员基于匹配结果可以选择一个第二标准数据元作为第一数据元。
本申请实施例在实际实现时,匹配结果包括的信息还可以有其他形式。例如,匹配结果仅包括至少一个第二标准数据元的数据元标识;或者,匹配结果仅包括至少一个第二标准数据元的数据元描述信息;或者,匹配结果在包括前述至少一个第二标准数据元的信息的基础上,还包括至少一个第二标准数据元中每个第二标准数据元的数据元标识与关键字的匹配度,从而提高匹配结果对业务人员可参考性。
可选的,若匹配得到的第二标准数据元有多个时,在匹配结果中,多个第二标准数据元的信息可以按照第一指定顺序排列。例如,该第一指定顺序可以由以下两种示意性实现方式实现:
在第一种示意性实现方式中,多个第二标准数据元的信息按照第二标准数据元的数据元标识与关键字的匹配度降序排序(即按照匹配度从大到小的顺序排序)。
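For example, for any second standard data element, the degree of matching between its data element identifier and the keyword can be calculated in several ways. For instance, the matching degree P1 satisfies a first matching degree formula:
P1 = M / N;
where M is the number of characters that the data element identifier of the second standard data element and the keyword have in common, that is, the number of characters in their intersection, and N is the larger of the character count of the data element identifier and the character count of the keyword. For example, if the data element identifier of the second standard data element is "整型测试字段" ("integer test field") and the keyword is "整型" ("integer"), then M = 2, N = 6, and P1 = 1/3 ≈ 33.3%.
It is worth noting that computing the matching degree with the first matching degree formula is only one illustrative method provided by the embodiments of this application. The matching degree can also be obtained in other ways, for example with conventional methods such as those used by fuzzy matching algorithms (for example ES).
Continuing with FIG. 4, when the data element identifier of the second standard data element is "整型测试字段1", its matching degree with the keyword "整型" is P1 = 2/7 ≈ 28.6%. Therefore, in FIG. 4, after sorting in descending order of the matching degree between the data element identifier and the keyword, the information of the second standard data element whose identifier is "整型测试字段" precedes the information of the second standard data element whose identifier is "整型测试字段1" in the matching result output by the data standard management system.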
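The first matching degree formula can be checked with a few lines of code. One interpretation is assumed here: the "intersection" of identifier and keyword is computed as a multiset intersection of characters, so a repeated character counts only as often as it appears in both strings. The function name is hypothetical.

```python
from collections import Counter


def match_degree(identifier: str, keyword: str) -> float:
    """P1 = M / N, where M is the number of shared characters and N is the
    larger of the two character counts."""
    if not identifier or not keyword:
        return 0.0
    shared = Counter(identifier) & Counter(keyword)  # multiset intersection
    m = sum(shared.values())
    n = max(len(identifier), len(keyword))
    return m / n


p_first = match_degree("整型测试字段", "整型")    # M = 2, N = 6
p_second = match_degree("整型测试字段1", "整型")  # M = 2, N = 7
```

Because `p_first` is larger than `p_second`, sorting by matching degree in descending order places "整型测试字段" ahead of "整型测试字段1", in line with the ordering described for FIG. 4.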
在第二种示意性实现方式中,多个第二标准数据元的信息按照第二标准数据元所属的数据标准的优先级降序排序(即按照优先级从高到低的顺序排序)。
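Optionally, the priority of a data standard may include a standard priority or a time priority. The standard priority is the priority of the standard itself; from highest to lowest, the order is national standard, industry standard, local standard. For the time priority, the closer the release time or implementation time is to the present, the higher the priority.
Referring to FIG. 5, after sorting in descending order of the standard priority of the data standard to which each second standard data element belongs, the information of the second standard data element whose identifier is "整型测试字段1" precedes that of the second standard data element whose identifier is "整型测试字段" in the matching result output by the data standard management system.
It is worth noting that the first specified order can also take other forms; for example, it can be an order determined by combining the first and second illustrative implementations above. That is, the information of the plurality of second standard data elements is sorted by both the matching degree between each second standard data element's identifier and the keyword and the priority of the data standard to which it belongs. For example, for each second standard data element, the data standard management system can obtain the matching degree between its data element identifier and the keyword, obtain the priority of the data standard to which it belongs, and assign a numeric value to that priority according to a specified rule, where the value is positively correlated with the priority (the higher the priority, the larger the value). Then, using the weights assigned in advance to the matching degree and the priority, it determines a ranking indicator value for the second standard data element as the weighted sum of its matching degree and priority value. Finally, the data standard management system sorts the information of the second standard data elements by their ranking indicator values, usually in descending order.
For example, suppose that for a given second standard data element the matching degree is a, the numeric value assigned to its priority is b, and the weights assigned in advance to the matching degree and the priority are X and Y respectively. The ranking indicator value of that second standard data element is then c = aX + bY.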
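A small sketch of the combined ordering follows. The numeric values assigned to the priorities and the default weights are assumptions (the text only requires that the value grow with the priority), and the two candidates and the standards they belong to are made-up sample data.

```python
# Hypothetical values assigned to standard priorities: higher priority,
# larger value (national > industry > local).
PRIORITY_VALUE = {"national": 1.0, "industry": 0.5, "local": 0.25}


def ranking_value(match: float, priority: str, x: float, y: float) -> float:
    """c = a*X + b*Y for matching degree a and priority value b."""
    return match * x + PRIORITY_VALUE[priority] * y


def rank(candidates: list, x: float = 0.5, y: float = 0.5) -> list:
    """Sort candidates in descending order of their ranking indicator value."""
    return sorted(
        candidates,
        key=lambda c: ranking_value(c["match"], c["priority"], x, y),
        reverse=True,
    )


candidates = [
    {"identifier": "整型测试字段", "match": 1 / 3, "priority": "local"},
    {"identifier": "整型测试字段1", "match": 2 / 7, "priority": "national"},
]
by_both = [c["identifier"] for c in rank(candidates)]
by_match_only = [c["identifier"] for c in rank(candidates, x=1.0, y=0.0)]
```

Setting `y` to 0 reduces the combined order to the pure matching-degree order, and setting `x` to 0 reduces it to the pure priority order, so the two illustrative implementations are special cases of the weighted sum.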
通过按照第一指定顺序对多个第二标准数据元的信息排序,可以提高对业务人员有效的提示,提高提示命中率。
在第二种可选方式中,通过接收线下已编辑的数据标准文档的方式获取第一数据库表结构的信息。
业务人员可以通过其使用的第二终端设备采用线下编辑的方式生成数据标准文档,数据标准文档包括第一数据库表结构的信息,并将生成的数据标准文档传输至数据标准管理系统。相应的,数据标准管理系统接收该数据标准文档。该数据标准文档可以为多种数据类型的文档,例如数据库表类型的数据文档。
在第一种可选实现方式中,业务人员可以在第二终端设备运行第三方建模工具,通过该第三方建模工具生成数据标准文档,然后将该数据标准文档输入数据标准管理系统。示例的,该第三方建模工具可以为E-Rwin或PowerDesigner等数据建模工具。
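In a second optional implementation, the data standard management system provided by the embodiments of this application can also support access by a third-party modeling tool. The third-party modeling tool can access (for example, query) the data standard library, obtain the information of the standard data elements stored there, and generate the data standard document based on that information. In other words, the third-party modeling tool can generate data standard documents that meet the requirements of the data standard library. For example, business personnel can run the third-party modeling tool on the second terminal device, and the tool generates the data standard document based on the data standard library. Correspondingly, the data standard management system receives the data standard document that the third-party modeling tool generated based on the data standard library. For example, the third-party modeling tool may be a data modeling tool such as E-Rwin or PowerDesigner.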
前述第一种可选实现方式中,由于第三方建模工具的数据标准文档的生成规则与数据标准库所要求的生成规则不同,因此,数据标准管理系统获取第三方建模工具生成的数据标准文档后,需要对其中的第一数据库表结构的信息进行校验,得到符合数据标准库要求的第一数据库表结构的信息;而第二种可选方式中,由于第三方建模工具可以支持满足数据标准库要求的数据标准文档的生成,相应的第一数据库表结构的信息全部或部分符合数据标准库的要求,因此可以减少校验的运算代价,降低校验成本。
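Optionally, in step 201, the data standard management system can also output a data dictionary template, which is a reference template for the information of the first database table structure. Based on the data dictionary template, business personnel can enter the information of the first database table structure into the data standard management system through the second terminal device, for example the information of a single data element or of a batch of data elements; correspondingly, the data standard management system receives the information of the first database table structure entered based on the data dictionary template. For the specific way of obtaining the information of the first database table structure, refer to the first or second optional mode, described above, of obtaining the information of the first database table structure of the first business system to be launched.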
示例的,该数据字典模板可以如表1所示。
表1
Figure PCTCN2021075477-appb-000001
Figure PCTCN2021075477-appb-000002
请参考表1,该数据字典模板包括如表1第一行所示的数据元的信息所涉及的参数,表1中以涉及的参数包括数据库名称、物理表的英文名称、数据元的英文名称和数据元的中文名称等等参数为例进行说明;该数据字典模板还包括如表1第二行所示的数据元的信息所涉及的参数的解释信息(或称描述信息),该解释信息用于解释每个对应的参 数所表示的含义。例如,参数:可选值,对应的解释信息为:“数据的可选值及描述,例如性别:F,女性;M,男性”。该数据字典模板还包括如表1第三至九行所示的共6个数据元的信息(也即是每一行对应一个数据元的信息)的填写示例,用于提示业务人员如何填写数据元的信息。
前述表1仅为本申请实施例提供的示意性例子,本申请实施例在实际实现时,数据字典模板还可以有其他形式,只要能够达到为业务人员输入数据元的信息提供参考的目的即可。
数据标准管理系统通过输出数据字典模板,以供业务人员进行参考,使得业务人员不再单纯靠自身经验来制定数据元的信息,而是有所依据地制定数据元的信息,从而可以提高获取的第一数据库表结构的信息的准确性,减少后续校验过程的复杂度和运算代价。
步骤202、数据标准管理系统基于数据标准库对第一数据库表结构的信息进行校验。
由于第一数据库表结构是业务人员自定义的计划在第一业务系统使用的数据库表结构,数据标准管理系统获取第一数据库表结构的信息后,需要基于数据标准库对该信息进行校验。本申请实施例中,对第一数据库表结构的信息进行校验至少可以包括以下数据标准符号性校验和数据标准规范性校验共两方面:
第一方面,数据标准符号性校验。该校验过程指的是基于数据标准库中的标准数据元的信息对至少一个数据元的信息进行校验。标准数据元的信息即前述“符号”。
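Optionally, assuming that the first data element is one of the at least one data element included in the information of the first database table structure, the process of verifying the information of the at least one data element based on the information of the standard data elements in the data standard library includes:
Step A1: The data standard management system compares the information of the first data element with the information of the plurality of standard data elements stored in the data standard library.
It is worth noting that the data standard library usually stores the information of the standard data elements in coded form. To match the information of the first data element effectively against the stored information of the plurality of standard data elements, the information of the standard data elements is usually first translated into a form that can be compared with the information of the first data element, for example text.
In the embodiments of this application, the data standard management system compares the information of the first data element with the information of the plurality of standard data elements stored in the data standard library mainly to determine whether any of that information matches the information of the first data element, that is, to determine for each standard data element whether its information matches the information of the first data element. Whether the information of a standard data element matches the information of the first data element depends on the matching condition.
In a first optional mode, the matching condition is that the information of every corresponding parameter in the information of the standard data element and the information of the first data element is the same. Taking Table 1 as an example, the standard data element's database name, English name of the physical table, English name of the data element, Chinese name of the data element, and other parameters are each the same as the corresponding parameters of the first data element.
As described above, the information of a data element includes attribute information (in the first row of Table 1, the information of every parameter other than the optional values is attribute information); in some scenarios it may also include a code set (in the first row of Table 1, the information of the parameter "optional values" belongs to the code set). Because the information of the first database table structure is defined by business personnel, the at least one data element it contains is also user-defined, and the information of some data elements may not define a code set. Therefore, if the matching condition requires every corresponding parameter of the standard data element and the first data element to be the same, the probability that the data standard library contains information matching the first data element is low, and matching efficiency is low.
In a second optional mode, the matching condition is that the information of every corresponding parameter in the attribute information of the standard data element and the attribute information of the first data element is the same. Taking Table 1 as an example, the standard data element's database name, English name of the physical table, English name of the data element, Chinese name of the data element, and other parameters (that is, every parameter other than the optional values) are each the same as the corresponding parameters of the first data element. The process in which the data standard management system compares the information of the first data element with the information of the plurality of standard data elements then includes comparing the attribute information of the first data element with the attribute information of the plurality of standard data elements.
Similarly to the first optional mode, if the matching condition requires every corresponding parameter in the attribute information of the standard data element and the first data element to be the same, and the attribute information involves many parameters, the probability that the data standard library contains information matching the first data element is low, and matching efficiency is low.
In a third optional mode, the matching condition is that the information of the specified parameters in the attribute information of the standard data element is the same as the information of the corresponding parameters in the attribute information of the first data element. The specified parameters are usually parameters that describe the data element's own attributes, such as its name, definition, structure, and/or value rules; the name may include parameters such as the English name of the data element, and the definition may include parameters such as the character type, character length, and/or character precision. Because the specified parameters describe the data element's own attributes and are few in number, this matching condition makes it more likely that the data standard library contains information matching the first data element, which improves matching efficiency.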
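The three matching conditions differ only in which parameters are compared, which the sketch below makes explicit. The parameter names are illustrative stand-ins loosely based on the columns the text attributes to Table 1; the actual parameter set is not reproduced in this section.

```python
# Hypothetical parameter names; the last one stands for the code set.
ALL_PARAMS = ("database_name", "table_name", "english_name", "chinese_name",
              "char_type", "char_length", "optional_values")
ATTRIBUTE_PARAMS = ALL_PARAMS[:-1]          # everything except the code set
SPECIFIED_PARAMS = ("english_name", "char_type", "char_length")


def is_match(element: dict, standard: dict, params: tuple) -> bool:
    """True when both records define every compared parameter identically."""
    return all(
        p in element and p in standard and element[p] == standard[p]
        for p in params
    )


standard = {
    "database_name": "std_db", "table_name": "t_person",
    "english_name": "gender", "chinese_name": "性别",
    "char_type": "char", "char_length": 1,
    "optional_values": "0: male; 1: female",
}
# A user-defined element: different database, no code set defined.
candidate = {
    "database_name": "biz_db", "table_name": "t_person",
    "english_name": "gender", "chinese_name": "性别",
    "char_type": "char", "char_length": 1,
}
```

With this sample data only the third condition succeeds, which illustrates why comparing a small set of specified parameters makes a match more likely.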
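Step A2: When the information of the first data element matches none of the information of the plurality of standard data elements in the data standard library, the data standard management system sends first modification prompt information, which instructs updating the information of the first data element.
For example, the data standard management system may send the first modification prompt information to the second terminal device, which presents it to business personnel so that they can update the information of the first data element.
Step A3: After receiving updated information of the first data element that matches the information of any one of the plurality of standard data elements, the data standard management system determines that the information of the first data element is verified successfully.
After receiving the updated information of the first data element, the data standard management system compares it with the information of the plurality of standard data elements (see step A1). If the updated information still matches none of the information of the plurality of standard data elements in the data standard library, the data standard management system sends the first modification prompt information again (see step A2). Each time it receives updated information of the first data element, the data standard management system repeats the comparison step, the prompt step, and the receiving step until the information of one of the standard data elements in the data standard library matches the updated information of the first data element.
When the information of any one of the plurality of standard data elements in the data standard library matches the updated information of the first data element, the data standard management system determines that the information of the first data element is verified successfully.
As can be seen from the above, through the first modification prompt information, which the data standard management system may send multiple times, business personnel can revise the information of the first data element repeatedly until it meets the requirements of the standard data elements in the data standard library, so that business personnel can define data element information that is consistent with the information of a standard data element in the data standard library.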
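Steps A1 to A3 form a compare, prompt, and receive loop, sketched below. The callbacks `request_update` (send the first modification prompt information and return the updated element) and `is_match` (the chosen matching condition) are hypothetical, and the `max_rounds` bound is an addition for safety that the source does not mention.

```python
from typing import Callable


def verify_element(element: dict, standards: list,
                   request_update: Callable[[dict], dict],
                   is_match: Callable[[dict, dict], bool],
                   max_rounds: int = 10):
    """Repeat steps A1-A3 until the element matches some standard element.

    Returns the verified element and the number of prompts that were sent.
    """
    rounds = 0
    # Step A1: compare against every standard data element.
    while not any(is_match(element, s) for s in standards):
        if rounds >= max_rounds:
            raise RuntimeError("element still unmatched after repeated prompts")
        # Step A2: prompt for an update and receive the updated element.
        element = request_update(element)
        rounds += 1
    # Step A3: a match was found, so verification succeeds.
    return element, rounds


standards = [{"english_name": "gender"}]
updates = iter([{"english_name": "gendr"}, {"english_name": "gender"}])
verified, prompts = verify_element(
    {"english_name": "sex"}, standards,
    request_update=lambda current: next(updates),
    is_match=lambda e, s: e == s,
)
```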
如前所述,第一数据库表结构的信息携带在可在线编辑或者离线的数据标准文档中,因此,业务人员在进行第一数据元的信息的更新时,通常是在该第一数据元所在第一数据库表结构的信息中更新的,也即是在数据标准文档中更新。数据标准管理系统需要先 在更新后的第一数据库表结构的信息中定位得到更新后的第一数据元的信息,再对更新后的第一数据元的信息进行校验。
在一种可选方式中,数据标准管理系统接收更新后的第一数据库表结构的信息后,在更新后的第一数据库表结构的信息的全量的数据元的信息中确定第一数据元的信息。示例的,数据标准管理系统可以扫描中第一数据库表结构的信息的全部的数据元,从而在全部的数据元的信息中定位到第一数据元的信息。该过程称之为全量校验。
在另一种可选方式中,数据标准管理系统接收更新后的第一数据库表结构的信息后;可以在更新后的第一数据库表结构的信息中确定增量的数据元,并在增量的数据元的信息中确定第一数据元的信息。该过程称之为增量校验。
可选的,数据标准管理系统建立有更新指示规则,业务人员在通过第二终端设备更新数据元的信息时,可以按照该更新指示规则进行数据元的信息的更新,以便数据标准管理系统有效定位到增量的数据元。
例如,数据标准管理系统在第一修改提示信息中携带第一数据库表结构的信息,并在该第一数据库表结构的信息中添加了备注字段,则数据标准管理系统发送第一修改提示信息的过程相当于进行了第一数据库表结构的信息的回退。若第一数据库表结构的信息携带在数据标准文档中,则实现了文档的回退。业务人员在通过第二终端设备接收到该第一数据库表结构的信息后,对其中的数据元的信息进行更新,并在备注字段添加目标备注信息,该目标备注信息指示进行了更新的数据元。之后,数据标准管理系统接收到更新后的第一数据库表结构的信息后,通过查询该备注字段即可确定进行了更新的数据元,即增量的数据元,进而在增量的数据元定位到第一数据元。例如,第一数据库表结构的信息包括6行数据元的信息,业务人员在通过第二终端设备接收到该第一数据库表结构的信息后,对第一行数据元和第三行数据元的信息进行更新,并在备注字段添加指示第一行数据元和第三行数据元的目标备注信息。则数据标准管理系统通过查询该备注字段即可确定第一行数据元和第三行数据元为增量的数据元。
又例如,数据标准管理系统在第一修改提示信息中携带第一数据库表结构的信息,并在该第一数据库表结构的信息中添加了插件。业务人员在通过第二终端设备接收到该第一数据库表结构的信息后,对其中的数据元的信息进行更新,该插件会自动标识更新的数据元。之后,数据标准管理系统接收到更新后的第一数据库表结构的信息后,通过插件的标识即可确定进行了更新的数据元,即增量的数据元,进而在增量的数据元定位到第一数据元。其中,插件可以通过添加批注、高亮和/或添加指定颜色等方式标识更新的数据元。例如,第一数据库表结构的信息包括6行数据元的信息,业务人员在通过第二终端设备接收到该第一数据库表结构的信息后,对第一行数据元和第三行数据元的信息进行更新,该插件对第一行数据元和第三行数据元的信息进行了高亮处理。则数据标准管理系统将进行了高亮处理的第一行数据元和第三行数据元确定为增量的数据元。
通过增量校验可以减少数据标准管理系统查询的数据元的数量,提高确定更新后的数据元的效率。
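The difference between full verification and incremental verification based on a remark field can be sketched as follows. This is a hedged illustration: the dictionary keys `data_elements` and `remark`, and the convention that the remark lists the updated row numbers separated by commas (for example `"1,3"`), are assumptions made here and are not specified by this application.

```python
def rows_to_check(table_info: dict) -> list:
    """Select the data-element rows to verify in an updated table structure.

    If the (hypothetical) remark field names updated rows, only those rows are
    returned (incremental verification); otherwise every row is returned
    (full verification).
    """
    rows = table_info["data_elements"]
    remark = table_info.get("remark", "").strip()
    if not remark:
        return list(rows)  # full verification
    # Row numbers in the remark are 1-based, as in "first row, third row".
    updated = {int(part) for part in remark.split(",") if part.strip()}
    return [row for index, row in enumerate(rows, start=1) if index in updated]
```

With six rows and a remark indicating rows 1 and 3, only two rows need to be queried instead of six.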
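Optionally, the first modification prompt information may include information of at least one first standard data element, where the data element identifier of each of the at least one first standard data element matches the data element identifier of the first data element. In this way, business personnel can obtain from the first modification prompt information the information of each first standard data element related to the data element identifier of the first data element, and select, from the at least one first standard data element, the one to which the first data element should be modified.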
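It is worth noting that the first modification prompt information usually includes information of a specified number of first standard data elements, where the specified number is an integer greater than 1; this makes the first modification prompt information more useful to business personnel as a reference. In addition, the algorithm by which the data standard management system matches the received data element identifier of the first data element against the data element identifiers of the multiple standard data elements may be a fuzzy matching algorithm; that is, the data element identifier of each of the at least one first standard data element fuzzily matches the data element identifier of the first data element. For example, the fuzzy matching algorithm is the search algorithm in ElasticSearch.
The matching results obtained with a fuzzy matching algorithm may include both exact matches and results other than exact matches. Compared with obtaining the information of the first standard data elements purely by exact matching, fuzzy matching covers a wider range of content and returns information of more first standard data elements, which makes the matching results more useful to business personnel as a reference. The matching algorithm may also be another algorithm, which is not limited in the embodiments of this application.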
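Optionally, if multiple first standard data elements are obtained by matching, the information of the multiple first standard data elements may be arranged in a second specified order in the first modification prompt information. For example, the second specified order may be realized by the following two illustrative implementations:
In the first illustrative implementation, the information of the multiple first standard data elements is sorted in descending order of the matching degree between the data element identifier of each first standard data element and the data element identifier of the first data element (that is, from the highest matching degree to the lowest).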
在第二种示意性实现方式中,多个第一标准数据元的信息按照第一标准数据元所属的数据标准的优先级降序排序(即按照优先级从高到低的顺序排序)。
值得说明的是,第二指定顺序还可以有其他方式,例如该第二指定顺序是将前述第一种和第二种示意性实现方式进行结合确定的顺序。也即是,多个第一标准数据元的信息按照第一标准数据元的数据元标识与第一数据元的数据元标识的匹配度以及第一标准数据元所属的数据标准的优先级排序。例如,对于每个第一标准数据元,数据标准管理系统可以获取该第一标准数据元的数据元标识与第一数据元的数据元标识的匹配度,以及获取该第一标准数据元所属的数据标准的优先级,并按照指定规则为该第一标准数据元所属的数据标准的优先级赋值,其中,优先级与所赋的数值正相关;接着,按照预先为匹配度和优先级分别分配的权值,基于该第一标准数据元对应的匹配度和优先级,通过加权求和的方式确定该第一标准数据元的排序指示值。最终,数据标准管理系统按照各个第一标准数据元对应的排序指示值进行各个第一标准数据元的信息的排序。通常按照排序指示值降序排序。
通过按照第二指定顺序对多个第一标准数据元的信息排序,可以提高对业务人员有效的提示,提高提示命中率。
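The combined ordering described above (a weighted sum of matching degree and priority value) can be sketched as follows. The weights 0.7 and 0.3, the field names `match_degree` and `priority_value`, and the assumption that both values are already normalized to the range 0 to 1 are illustrative choices made here, not values given in this application.

```python
def sort_indicator(candidate: dict, match_weight: float = 0.7,
                   priority_weight: float = 0.3) -> float:
    """Weighted sum of matching degree and the value assigned to the priority."""
    return (match_weight * candidate["match_degree"]
            + priority_weight * candidate["priority_value"])


def rank_candidates(candidates: list, match_weight: float = 0.7,
                    priority_weight: float = 0.3) -> list:
    """Sort first standard data elements by sort indicator, highest first."""
    return sorted(
        candidates,
        key=lambda c: sort_indicator(c, match_weight, priority_weight),
        reverse=True,
    )
```

Setting `priority_weight` to 0 reduces this to the first illustrative implementation, and setting `match_weight` to 0 reduces it to the second.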
需要说明的是,前述两种示意性实现方式以及该两种示意性实现方式结合的方式的具体过程可以参考前述步骤201中第一指定顺序所对应的两种示意性实现方式以及该两种示意性实现方式结合的方式,其中,步骤202与步骤201不同的是,第一数据元的数据元标识的内容与关键字的内容可能不同,但均包括一个或多个字符,本申请实施例对此不再赘述。
进一步可选的,在前述第一方面所提供的校验过程中,数据标准管理系统还可以根据校验情况添加一些备注字段,以提示业务人员需要注意的信息。
例如,假设第二数据元为第一数据库表结构的信息所包括的至少一个数据元中的一个数据元。当第二数据元对应的数值为可枚举数值时,为第二数据元添加数据元备注信息,数据元备注信息用于标识第二数据元对应的可枚举数值。请参考前述步骤A1,由于第一数据库表结构的信息是业务人员自定义的,有可能没有定义代码集的信息,又有可能定义的代码集的信息不准确。因此在本申请实施例中,可以在第二数据元对应的数值为可枚举数值时,为第二数据元添加数据元备注信息,以添加准确的代码集的信息。例如,第二数据元的名称为:年龄,为第二数据元添加数据元备注信息,该数据元备注信息用于标识第二数据元对应的1至120共120个可枚举数值。
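In this way, after verification of the information of the first database table structure is completed, the target database table structure determined based on the first database table structure may still include the data element remark information added for the second data element. After the first business system goes online, if data corresponding to the second data element needs to be collected, the data can be collected directly in the format of the enumerable values corresponding to the second data element, so that the finally collected data meets the format requirements of the data standard library, that is, complies with the relevant standards.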
第二方面,数据标准规范性校验。该校验过程指的是对数据标准的信息的规范性进行校验。主要校验数据标准的信息的格式。
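Optionally, the process of verifying the information of the first database table structure based on the data standard library, where the data standard library includes information of multiple standard data elements, includes: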
步骤B1、数据标准管理系统检测第一数据库表结构的信息的格式是否符合指定格式要求。
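As described in step 201, the information of the first database table structure includes information of at least one data element, and may further include information of the first database and first table information. The process in which the data standard management system detects whether the format of the information of the first database table structure meets the specified format requirements therefore includes: detecting whether the format of the information of each data element meets the specified data element format requirements, for example, whether the English name of the data element consists of specified characters (such as uppercase English letters); detecting whether the format of the information of the first database meets the specified database format requirements, for example, whether the English name of the database consists of specified characters (such as uppercase English letters), and whether the length of the database code is less than a first specified length threshold, which may be 60 digits (a digit here refers to the number of digits of the value); and detecting whether the format of the first table information meets the specified database table format requirements, for example, whether the database table identifier consists of specified characters (such as uppercase English letters), and whether the length of the database table code is less than a second specified length threshold, which may be 60 digits.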
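The format detection of step B1 can be sketched as follows, using the example rules mentioned above (uppercase English letters only, code length less than 60). The dictionary keys are hypothetical names introduced for illustration, and returning the list of offending fields is an assumed design that lets a later prompt say which information fails the requirements.

```python
import re

UPPERCASE_ONLY = re.compile(r"[A-Z]+")  # example "specified characters"
MAX_CODE_LENGTH = 60                     # example length threshold


def check_format(table_info: dict) -> list:
    """Return the names of the fields that violate the example format rules."""
    problems = []
    if not UPPERCASE_ONLY.fullmatch(table_info["database_name"]):
        problems.append("database_name")
    if not len(table_info["database_code"]) < MAX_CODE_LENGTH:
        problems.append("database_code")
    if not UPPERCASE_ONLY.fullmatch(table_info["table_id"]):
        problems.append("table_id")
    if not len(table_info["table_code"]) < MAX_CODE_LENGTH:
        problems.append("table_code")
    for index, element in enumerate(table_info["data_elements"], start=1):
        if not UPPERCASE_ONLY.fullmatch(element["english_name"]):
            problems.append(f"data_elements[{index}].english_name")
    return problems
```

An empty list means the format verification succeeds; a non-empty list could be carried in the second modification prompt information.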
步骤B2、当第一数据库表结构的信息的格式不符合指定格式要求时,数据标准管理系统发送第二修改提示信息,该第二修改提示信息指示更新第一数据库表结构的信息的格式。
示例的,该数据标准管理系统可以向第二终端设备发送第二修改提示信息,由第二终端设备向业务人员呈现该第二修改提示信息,以便业务人员更新第一数据库表结构的信息的格式。
步骤B3、数据标准管理系统在接收到格式符合格式要求的更新后的第一数据库表结构的信息后,确定对第一数据库表结构的信息的格式校验成功。
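After receiving the updated information of the first database table structure, the data standard management system detects whether the format of the information of the first database table structure meets the specified format requirements; for this detection step, refer to step B1. When the format of the information of the first database table structure still does not meet the specified format requirements, the data standard management system sends the second modification prompt information again; for this prompting step, refer to step B2. After again receiving updated information of the first database table structure, the data standard management system repeats the detection step, the prompting step, and the step of receiving the information of the first database table structure, until the format of the information of the first database table structure meets the specified format requirements.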
当更新后的第一数据库表结构的信息的格式符合指定格式要求,数据标准管理系统确定对第一数据库表结构的信息的格式校验成功。
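As can be seen from the above, by means of the second modification prompt information sent multiple times by the data standard management system, business personnel can modify the format of the information of the first database table structure multiple times until it meets the data standard management system's requirements on the format of standard information, so that business personnel can define information of a database table structure that meets the requirements.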
值得说明的是,前述第二修改提示信息通常会指示出第一数据库表结构的信息中具体哪个信息不符合指定格式要求,如数据元的信息或者第一数据库的信息或者第一表信息不符合对应的格式要求。
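In one optional manner, after the updated information of the first database table structure is received, all information in the first database table structure is checked against the specified format requirements. For example, the data standard management system may scan all of the information of the first database table structure and detect whether each scanned item meets the corresponding format requirements.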
在另一种可选方式中,在接收到更新后的第一数据库表结构的信息后,检测上次不符合对应的格式要求的信息,无需检测第一数据库表结构中的所有的信息。通过仅检测上次不符合对应的格式要求的信息可以减少数据标准管理系统查询的信息量,提高检测效率。
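It is worth noting that the verification processes of the foregoing two aspects may be performed simultaneously or separately; the order of execution is not limited in the embodiments of this application.
Through the verification process of step 202, the information of the first database table structure created for the first business system, such as the first table information, the information of the data elements, and the code sets, can be matched and verified automatically, and the system administrator only needs to perform a simple check. This greatly reduces the manpower required; and because fewer manual review steps are involved, human error can be reduced and the accuracy of the finally obtained target database table structure improved. Compared with the conventional technology, more than 75% of the labor cost can be saved. In addition, the original offline workflow based on database tables (filling in, passing on, comparing, writing review comments, feedback, modification, and re-review) is simplified into automatic verification, comparison and modification, submission for review, and feedback of the review result. This reduces the offline information transfer steps, saves transfer time, further simplifies the workflow, and improves work efficiency, typically by more than 70%.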
步骤203、数据标准管理系统在对第一数据库表结构的信息校验后,获取第一业务系统的目标数据库表结构,该目标数据库表结构基于第一数据库表结构确定。
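The target database table structure is the database table structure to be used by the first business system after it goes online. It is in fact obtained in advance, before the first business system goes online, but is only used after the first business system goes online. The information of the target database table structure includes part or all of the information of the first database table structure. In one optional manner, the information of the first database table structure may be directly determined as the information of the target database table structure.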
在另一种可选方式中,第一数据库表结构的信息还需要进一步调整,以得到目标数据库表结构的信息。
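In different periods, the data elements of different business systems may change according to actual conditions. Although the data standard library does not record information of a corresponding standard data element, the information of such a data element is allowed to be added for a particular business system. However, the data standard management system cannot effectively verify the information of these data elements. Therefore, after step 202, that is, after the data standard management system has automatically verified the information of the first database table structure, the data standard management system also supports a manual secondary verification of the information of the first database table structure.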
示例的,该二次校验过程可以包括:
与前述第一方面对应的,数据标准管理系统在确定对第一数据元的信息校验成功后,向指定终端设备发送该第一数据元的信息。该指定终端设备是用于进行二次校验的校验人员的终端设备,其可以为系统管理员或者项目管理员的终端设备。例如,假设校验人员为系统管理员,则指定终端设备为前述第三终端设备。校验人员在通过指定终端设备接收到第一数据元的信息后,判定第一数据元的信息是否需要修改,并基于判定结果通过指定终端设备向数据标准管理系统发送第一校验响应信息,该第一校验响应信息用于指示对第一数据元的信息进行修改,或者指示对第一数据元的信息二次校验成功。该数据标准管理系统接收第一校验响应信息,并将该第一校验响应信息发送至业务人员的第二终端设备。
当该第一校验响应信息用于指示对第一数据元的信息进行修改,业务人员可以通过第二终端设备对第一数据元的信息进行修改,再次通过数据标准管理系统发送至指定终端设备,由校验人员进行校验,直至第二终端设备接收到的第一校验响应信息指示对第一数据元的信息二次校验成功。
当该第一校验响应信息用于指示对第一数据元的信息二次校验成功,业务人员无需再修改第一数据元的信息。
与前述第二方面对应的,数据标准管理系统在确定对第一数据库表结构的信息的格式校验成功后,向指定终端设备发送该第一数据库表结构的信息。校验人员在通过指定终端设备接收到第一数据库表结构的信息后,判定第一数据库表结构的信息的格式是否需要修改,并基于判定结果通过指定终端设备向数据标准管理系统发送第二校验响应信息,该第二校验响应信息用于指示对第一数据库表结构的信息的格式进行修改,或者指示对第一数据库表结构的信息的格式二次校验成功。该数据标准管理系统接收第二校验响应信息,并将该第二校验响应信息发送给业务人员的第二终端设备。
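When the second verification response information indicates that the format of the information of the first database table structure is to be modified, business personnel may modify the format of the information of the first database table structure through the second terminal device and send it again to the specified terminal device through the data standard management system for verification by the verification personnel, until the second verification response information received by the second terminal device indicates that the secondary verification of the format of the information of the first database table structure has succeeded.
When the second verification response information indicates that the secondary verification of the format of the information of the first database table structure has succeeded, business personnel do not need to modify the format of the information of the first database table structure any further.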
值得说明的是,第一数据元的信息和第一数据库表结构的信息可以携带在同一校验请求(也称校验申请)中发送给指定终端设备,第一校验响应信息和第二校验响应信息可以是同一信息,以减少数据标准管理系统与各个终端设备交互的次数,节约网络开销。
通过对第一数据库表结构的信息进行人工二次校验,可以保证最终获取的目标数据库表结构的信息的灵活性和可靠性。
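Optionally, the information of the target database table structure may be the information of the first database table structure after the manual secondary verification has succeeded.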
可选的,前述校验成功和/或人工二次校验成功后,还可以通过发送第三修改提示信息,提示业务人员进一步对第一数据库表结构的信息进行调整,得到目标数据库表结构的信息。示例的,最终将第一数据库表结构的信息满足落标率的要求和/或匹配率的要求的第一数据库表结构确定为目标数据库表结构。
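The requirement on the standard-landing rate (落标率) means that the standard-landing rate of the information of the first database table structure is greater than a specified standard-landing rate threshold. The standard-landing rate is the ratio of the number of actually landed data elements in the information of the first database table structure to the number of data elements that should be landed in the information of the first database table structure. An actually landed data element is a data element that matches a standard data element; for the definition of matching, refer to step A1. A data element that should be landed is a data element whose business identifier (such as the English name of the data element) is the same as that of a standard data element but which does not match the standard data element (that is, it satisfies only the condition, among the matching conditions, that the business identifiers are the same). For example, if the number of actually landed data elements in the information of the first database table structure is 5 and the number of data elements that should be landed is 10, the standard-landing rate is 5/10 = 50%.
The requirement on the matching rate means that the matching rate of the information of the first database table structure is greater than a specified matching rate threshold. The matching rate is the ratio of the number of data elements that should be landed to the total number of data elements contained in the database table structures of the first business system.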
如前所述,第一业务系统的数据库表结构可以有一个或多个,该第一业务系统的数据库表结构所包含的数据元的总数为该第一业务系统中所有数据库表结构所包含的数据元的总个数。例如,第一业务系统的数据库表结构共3个,数据元总数为30个,前述第一数据库表结构中的应落标数据元的数量为6,则匹配率为6/30=20%。
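The two ratios defined above can be written out as a short worked example. The function names and the zero-denominator checks are assumptions added here for illustration; the numbers are the ones used in the examples above.

```python
def landing_rate(actual_landed: int, should_land: int) -> float:
    """Standard-landing rate: actually landed elements / elements that should land."""
    if should_land <= 0:
        raise ValueError("should_land must be positive")
    return actual_landed / should_land


def matching_rate(should_land: int, total_elements: int) -> float:
    """Matching rate: elements that should land / all elements of the business system."""
    if total_elements <= 0:
        raise ValueError("total_elements must be positive")
    return should_land / total_elements


# Worked examples from the description: 5 of 10, and 6 of 30.
print(f"{landing_rate(5, 10):.0%}")   # 50%
print(f"{matching_rate(6, 30):.0%}")  # 20%
```

A first database table structure would then be accepted as the target database table structure when these values exceed the respective specified thresholds.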
步骤204、数据标准管理系统输出目标数据库表结构对应的脚本。
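In the embodiments of this application, the data standard management system supports a script output function. After the information of the first database table structure has been verified and reviewed and the information of the target database table structure has been obtained, the data standard management system may generate and output a script corresponding to the target database table structure (also called a table creation script). The script is used to generate the information of the target database table structure and includes the database table creation statements of the first business system. Business personnel can receive the script through the second terminal device. After the first business system goes online, business personnel can load and run the script in the first business system; the running script generates the information of the target database table structure and builds the corresponding database table structure according to that information.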
本申请实施例中,数据标准管理系统输出脚本,无需业务人员自行编写脚本,减少业务人员的工作量,从而节约人工成本。
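As an illustration of what outputting a table creation script might look like, the sketch below renders a `CREATE TABLE` statement from the information of a target database table structure. The field names (`english_name`, `char_type`, `char_length`) and the exact statement layout are assumptions made here; the sketch also assumes that all names have already passed the format verification, since it does not quote or escape identifiers.

```python
def build_create_table(table_name: str, elements: list) -> str:
    """Render a CREATE TABLE statement with one column per data element."""
    columns = []
    for element in elements:
        column_type = element["char_type"]
        # Append the length only when the data element defines one.
        if element.get("char_length") is not None:
            column_type += f"({element['char_length']})"
        columns.append(f"    {element['english_name']} {column_type}")
    return f"CREATE TABLE {table_name} (\n" + ",\n".join(columns) + "\n);"


script = build_create_table(
    "PERSON",
    [
        {"english_name": "NAME", "char_type": "VARCHAR", "char_length": 60},
        {"english_name": "AGE", "char_type": "INTEGER"},
    ],
)
print(script)
```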
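Optionally, after the target database table structure is determined, the data standard management system may also send a target database table structure use request to the first terminal device of the project administrator, where the request carries the information of the target database table structure. Based on this information, the project administrator can create the database tables after the first business system goes online and, once the tables are built, send a database table use notification to the second terminal device of the business personnel through the data standard management system, to notify the business personnel that the target database table structure can be used after the first business system goes online. In this way, business personnel do not need to build the database tables themselves.
Steps 201 to 204 may be performed by the data standard management device 102 in the data standard management system 10 shown in FIG. 1, where steps 201 to 203 correspond to the data verification function and step 204 corresponds to the script output function. Optionally, the data standard management system 10 also supports the lifecycle management function corresponding to subsequent steps 205 to 206 and the data governance function shown in step 207. The lifecycle management function is performed by the lifecycle management device 103, and the data governance function is performed by the data governance device 104. Steps 205 to 206 are as follows: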
步骤205、数据标准管理系统接收数据标准库操作请求,数据标准库操作请求包括标准数据元添加请求、标准数据元更新请求、标准数据元删除请求或标准数据元查询请求。
在本申请实施例中,支持对数据标准库的多种数据标准库操作,如标准数据元添加操作、标准数据元更新操作、标准数据元删除操作和标准数据元查询操作。对应的数据标准库操作请求分别为标准数据元添加请求、标准数据元更新请求、标准数据元删除请求或标准数据元查询请求。
其中,标准数据元添加请求用于请求在数据标准库中添加一个或多个数据元的信息。示例的,系统管理员可以收集国标、行标或地标,将收集的数据标准拆分成多个数据元,通过一个或多个标准数据元添加请求向数据标准库添加该多个数据元的信息。每个标准数据元添加请求可以携带单个或批量的数据元的信息。例如,某一地标针对一新建业务系统添加了部分数据元,则需要将该部分数据元加入到数据标准库中,用于指导行业或下属单位的业务系统建设,统一数据标准,则系统管理员通过标准数据元添加请求添加该部分数据元。又例如,国标或行业标准新增了一个标准数据元,则系统管理员通过标准数据元添加请求添加该数据元,以与国标或行标保持同步。
值得说明的是,系统管理员可以基于数据标准管理系统来收集数据元。在一种可选方式中,数据标准管理系统还可以输出数据元收集模板,该数据元收集模板是用于进行数据元收集的参考模板,其结构可以参考前述数据字典模板;业务人员可以基于该数据元收集模板,通过第二终端设备向数据标准管理系统输入数据元的信息,如单个或批量的数据元的信息。相应的,数据标准管理系统接收到该数据元的信息后,将接收的数据元的信息发送至系统管理员的第三终端设备中,以供系统管理员参考。示例的,业务人员可以直接通过第二终端设备在该数据元收集模板输入数据元,得到更新后的数据元收集模板,数据标准管理系统将该更新后的数据元收集模板发送至系统管理员的第三终端设备。
标准数据元更新请求用于请求在数据标准库中更新一个或多个数据元的信息。例如,国标或行业标准更新了一个标准数据元,则系统管理员通过标准数据元更新请求更新该数据元,以与国标或行标保持同步。
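A standard data element deletion request is used to request deletion of the information of one or more data elements in the data standard library. For example, when the data standard corresponding to a national, industry, or local standard is discontinued, the system administrator deletes the information of the multiple data elements corresponding to that data standard from the data standard library through one or more standard data element deletion requests. Optionally, when the data standard management system subsequently performs a deletion operation on a standard data element in the data standard library, it may add a deletion flag to the standard data element without physically deleting it, so that it can still be queried later. The deletion flag indicates that the standard data element has been deprecated, and may also carry the reason for deprecation, for example, that the corresponding data standard has been discontinued.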
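Marking a standard data element with a deletion flag instead of physically deleting it can be sketched as follows. The in-memory dictionary and the field names `deleted` and `deprecation_reason` are hypothetical and only illustrate the idea.

```python
def mark_deleted(library: dict, element_id: str, reason: str) -> None:
    """Flag a standard data element as deprecated without removing it."""
    entry = library[element_id]
    entry["deleted"] = True
    entry["deprecation_reason"] = reason


def active_elements(library: dict) -> dict:
    """Return only the standard data elements that are not flagged as deleted."""
    return {key: value for key, value in library.items()
            if not value.get("deleted")}
```

Because the flagged entry stays in the library, a later query can still show the deprecated element together with the reason it was deprecated.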
标准数据元查询请求用于请求查询数据标准库中的数据元的信息。
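It is worth noting that, when data elements are added, the data standard management system may also output a data element template, which serves as a reference template for the data elements to be added. Based on the data element template, an operator can input, through the corresponding terminal device, the information of the data elements to be carried in the standard data element addition request; correspondingly, the data standard management system receives the information of the data elements input based on the data element template. The data element template may be as shown in Table 2.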
表2
[Table 2 is provided as an image in the original publication: Figure PCTCN2021075477-appb-000003]
请参考表2,该数据元模板包括如表2第一行所示的数据元的信息所涉及的参数,表2中以涉及的参数包括基础数据分类名称、基础数据分类编码、数据元的标识符和数据元的中文名称等等参数为例进行说明;该数据元模板还包括如表2第二行所示的数据元的信息所涉及的参数的解释信息(或称描述信息),该解释信息用于解释每个对应的参数所表示的含义。例如,参数:基础数据分类名称,对应的解释信息为:“数据元所属的分类,取值如下:XX公共:XX基础;司局公共:XX司局_公共信息;司局系统:XX司局_系统名称”。该数据元模板还包括如表2第三至五行所示的共3个数据元的信息(也即是每一行对应一个数据元的信息)的填写示例,用于提示操作人员如何填写数据元的信息。
前述表2仅为本申请实施例提供的示意性例子,本申请实施例在实际实现时,数据元模板还可以有其他形式,只要能够达到为操作人员输入数据元的信息提供参考的目的即可。
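By outputting the data element template for operators to refer to, the data standard management system allows operators to formulate the information of data elements on a documented basis rather than relying purely on their own experience, which improves the accuracy of the information of the standard data elements input into the data standard library.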
步骤206、数据标准管理系统在对数据标准库操作请求鉴权成功后,对数据标准库执行数据标准库操作请求所对应的操作。
由于数据标准库中存储有多个标准数据元的信息,这些标准数据元的信息是进行各个业务系统的数据库表结构校验的基础,若标准数据元的信息随意被增加、删除或修改,会造成数据标准库中信息的管理混乱,导致数据标准库的信息失去校验的有效参考性。因此,在对数据标准库进行数据标准库操作时,需要对数据标准库操作请求进行鉴权,在鉴权成功后,数据标准管理系统才能对数据标准库执行数据标准库操作请求所对应的操作。
如前所述,数据处理方法所涉及的应用场景中,涉及的工作人员包括项目管理员、业务人员和系统管理员。这些工作人员均需要在数据标准管理系统中注册相应的账号,在数据标准管理系统进行操作时,相应的操作信息中携带该账号,以标识操作人员的身份。不同身份的工作人员的账号等级不同,对数据标准管理系统的操作权限也不同,因此对应的鉴权方式也不同。
另外,由于数据标准库操作的类型有多种,不同类型的数据操作所涉及的保密级别不同,因此对应的鉴权方式也不同。
当数据操作是标准数据元添加操作、标准数据元更新操作或标准数据元删除操作时,数据操作所涉及的保密级别较高。通常的业务人员是不允许进行相关操作的。又由于数据标准管理系统由系统管理员维护,通常系统管理员具有管理数据标准库的权限,其可以执行保密级别较高的操作。但是系统管理员并不一定是对应项目的专业人员,所以可能对项目所对应的数据标准不了解,因此,还需要相关人员来进行数据操作的辅助鉴权。当数据操作是数据元查询操作时,数据操作所涉及的保密级别较低,项目管理员、业务人员和系统管理员通常是可以查看的。但是为了避免一些恶意访问,减少泄密,也可以添加相关人员来进行数据操作的辅助鉴权。
基于上述原理,本申请实施例示意性地提出以下几种鉴权方式:
第一种鉴权方式:高保密级别的数据操作的鉴权。
当数据操作请求是标准数据元添加请求、标准数据元更新请求或标准数据元删除请求时,数据标准管理系统检测该数据标准请求中携带的账号是否为第一等级的账号,该第一等级大于指定等级阈值,示例的,该第一等级的账号为系统管理员的账号。
当数据标准管理系统检测该数据标准请求中携带的账号不为第一等级的账号时,确定对数据标准库操作请求鉴权失败。
当数据标准管理系统检测到该数据标准请求中携带的账号为第一等级的账号时,在一种可选方式中,数据标准管理系统确定对数据标准库操作请求鉴权成功;在另一种可选方式中,数据标准管理系统向第二等级的账号所对应的终端设备发送该数据标准库操作请求,在接收到指示允许对数据标准库进行操作的允许指令后,确定对数据标准库操作请求鉴权成功;在接收到指示不允许对数据标准库进行操作的禁止指令后,确定对数据标准库操作请求鉴权失败。其中,第二等级高于或等于第一等级,该第二等级的账号与第一等级的账号不同。例如,第二等级的账号为项目管理员的账号,则相应的第二等级的账号所对应的终端设备为前述第一终端设备。项目管理员在通过第一终端设备接收到数据标准库操作请求后,根据请求的内容以及请求中携带的账号确定是否允许对应的人员对数据标准库进行操作,若允许对数据标准库进行操作,则通过第一终端设备发送允许指令,若不允许对数据标准库进行操作,则通过第一终端设备发送禁止指令。
第二种鉴权方式:低保密级别的数据操作的鉴权。
当数据操作请求是标准数据元查询请求时,数据标准管理系统检测该数据标准请求中携带的账号是否为数据标准管理系统分配的账号,也即是数据标准管理系统中的合法账号,示例的,该数据标准管理系统分配的账号为项目管理员、业务人员以及系统管理员的账号中的任一账号。
当数据标准管理系统检测该数据标准请求中携带的账号不为数据标准管理系统分配的账号时,确定对数据标准库操作请求鉴权失败。
当数据标准管理系统检测到该数据标准请求中携带的账号为数据标准管理系统分配的账号时,在一种可选方式中,数据标准管理系统确定对数据标准库操作请求鉴权成功;在另一种可选方式中,数据标准管理系统向第三等级的账号所对应的终端设备发送该数据标准库操作请求,在接收到指示允许对数据标准库进行操作的允许指令后,确定对数据标准库操作请求鉴权成功;在接收到指示不允许对数据标准库进行操作的禁止指令后,确定对数据标准库操作请求鉴权失败。
其中,第三等级高于或等于前述数据标准请求中携带的账号的等级,该第三等级的账号与数据标准请求中携带的账号不同。例如,第三等级的账号为项目管理员或系统管理员的账号,假设第三等级的账号为系统管理员的账号,则相应的第三等级的账号所对应的终端设备为前述第三终端设备。系统管理员在通过第三终端设备接收到数据标准库操作请求后,根据请求的内容以及请求中携带的账号确定是否允许对应的人员对数据标准库进行操作,若允许对数据标准库进行操作,则通过第三终端设备发送允许指令,若不允许对数据标准库进行操作,则通过第三终端设备发送禁止指令。
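In the foregoing two authentication manners, if the data standard management system successfully authenticates the data standard library operation request, it performs on the data standard library the operation corresponding to the data standard library operation request. Optionally, the data standard management system may also send a data operation response indicating that the operation corresponding to the data standard library operation request has been performed on the data standard library, or indicating that the operation succeeded.
If the data standard management system fails to authenticate the data standard library operation request, it prohibits the operation corresponding to the data standard library operation request from being performed on the data standard library. Optionally, the data standard management system may also send a data operation response indicating that the operation corresponding to the data standard library operation request is prohibited, or indicating that the operation failed.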
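The two authentication manners can be summarized in a single sketch. Everything named below is an assumption made for illustration: the numeric ordering of the account levels, the level threshold, the operation names, and the `approver` callable that stands in for the allow or forbid instruction returned by the terminal device of a higher-level account.

```python
from enum import IntEnum
from typing import Callable, Mapping, Optional


class Level(IntEnum):
    # Assumed ordering; the description only requires system administrators
    # to be above the threshold for high-secrecy operations.
    BUSINESS = 1
    SYSTEM_ADMIN = 2
    PROJECT_ADMIN = 3


HIGH_SECRECY_OPS = {"add", "update", "delete"}
LEVEL_THRESHOLD = Level.BUSINESS  # first-level accounts must exceed this


def authorize(operation: str, account: str, accounts: Mapping[str, Level],
              approver: Optional[Callable[[str, str], bool]] = None) -> bool:
    """Return True when the data standard library operation may be performed."""
    level = accounts.get(account)
    if level is None:
        return False  # not an account allocated by the system
    if operation in HIGH_SECRECY_OPS and level <= LEVEL_THRESHOLD:
        return False  # high-secrecy operation needs a first-level account
    if approver is None:
        return True   # optional manner without auxiliary authentication
    # Optional manner with auxiliary authentication by another account.
    return bool(approver(operation, account))
```

For example, `authorize("query", "bob", accounts)` succeeds for any registered account, whereas `authorize("add", "bob", accounts)` fails when `bob` is a business account.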
可选的,数据标准管理系统的数据标准库操作还包括对代码集的操作,如代码集添加操作、代码集更新操作、代码集删除操作和代码集查询操作。对应的数据标准库操作请求分别为代码集添加请求、代码集更新请求、代码集删除请求或代码集查询请求。相应的过程可以参考前述标准数据元的添加、更新、删除和查询操作的过程,本申请实施例对此不做赘述。
通过对数据标准库操作请求进行鉴权,可以保证数据标准库中数据的安全,保证数据标准库中标准数据元的信息的可靠性,有效防止泄密。
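The data standard management system supports the addition (also called publication), update, deletion, and query of data elements, and may also support functions such as maintenance of code sets, review of data elements, and other maintenance. It is worth noting that the data standard management system may also manage the information of multiple data standards in the form of database tables or documents, for users to consult and refer to. This series of functions for data standards or data elements may be called lifecycle management of data elements. Lifecycle management enables full-process, all-round control of data standards. In the embodiments of this application, the data standard management system may also systematically manage the documents corresponding to the data standards, for example by creating separate web pages by industry, level, and version, so that users can consult them easily.
Optionally, during the foregoing lifecycle management, the data standard management system may generate an operation log for each data operation. The data standard management system may query the operation logs corresponding to the data standard library after receiving a query instruction, or periodically; when the operation logs include an abnormal operation log, it issues an abnormality alarm. Based on the abnormality alarm, the system administrator can decide whether the data operation needs to be traced back so that the data can be rolled back, thereby effectively maintaining the data standard management system.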
步骤207、数据标准管理系统基于人工智能模型进行数据治理。
本申请实施例中,数据标准管理系统还支持数据治理功能。
在一种可选示例中,该数据治理功能对应的数据治理过程可以参考传统的数据治理过程。
在另一种可选示例中,数据标准管理系统预先建立有人工智能模型,通过人工智能模型进行数据治理。该数据治理过程可以包括以下步骤:
步骤C1、数据标准管理系统获取上线后的第二业务系统的第二数据库表结构的信息,该第二数据库表结构的信息包括至少一个数据元的信息。
For the process of step C1, refer to the process of step 201; details are not repeated in the embodiments of this application.
步骤C2、数据标准管理系统在基于人工智能模型检测到第三数据元的信息与多个标准数据元的信息均不匹配时,在多个标准数据元中确定与第三数据元对应的目标标准数据元,第三数据元为第二数据库表结构的信息包括的至少一个数据元的一个数据元。
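The data standard management system may input the information of the second database table structure into the artificial intelligence model. The artificial intelligence model detects whether the information of the third data element matches the information of the multiple standard data elements and, when the information of the third data element matches none of the information of the multiple standard data elements, determines among the multiple standard data elements the target standard data element corresponding to the third data element.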
步骤C3、数据标准管理系统基于人工智能模型,建立第三数据元与目标标准数据元的映射关系。
人工智能模型建立第三数据元与目标标准数据元的映射关系后,可以输出该映射关系,以供后续提供数据服务时使用。
其中,该人工智能模型可以由多个标准数据元的信息以及样本数据元的信息训练得到。通过采用人工智能模型来建立映射关系,可以提高映射关系的建立准确度和建立效率,从而提高数据治理的效果。
本申请实施例提供的业务数据处理方法步骤的先后顺序可以进行适当调整,步骤也可以根据情况进行相应增减,例如步骤206和207可以位于步骤201之前,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化的方法,都应涵盖在本申请的保护范围之内,因此不再赘述。
随着传统业务系统云化,微服务化或某种原因业务系统新建的推进,必然会涉及到数据库表和数据元的新建;原有的业务系统可能是多个厂家,不同时期建设的,各个厂家参照的数据标准和数据库的规范性要求也不同,例如:数据元标准,代码集标准等等;传统的数据标准管理系统是在某一业务系统上线后,对该业务系统中维护的数据库表结构进行稽查;若该数据库表结构不符合数据标准管理系统所存储的与该业务系统对应的目标数据标准的要求,数据标准管理系统会建立该数据库表结构与目标数据标准的映射关系。该过程称为数据治理过程。
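In contrast, in the embodiments of this application, when a business system is migrated to the cloud, converted to microservices, or newly built, its database table structure is verified before the business system goes online. In the embodiments of this application, a business system going online may mean going online for the first time (that is, after being newly built) or going online after being reconstructed. Through the foregoing data standard conformity verification and data standard normativeness verification, unified requirements on data standards and database design in business systems can be achieved, so that all business vendors build according to the same requirements, achieving a consistent understanding of data, consistent data standards, and consistent database specifications, and thereby improving data quality.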
为了便于读者理解,本申请实施例示意性提供一种业务系统改造后上线的场景。该场景是一种平台即服务(Platform as a Service,PaaS)场景。假设一个项目需要将该项目对应的多个业务系统进行业务系统微服务化改造。业务系统需要按照微服务拆分新建。业务系统新建时会涉及到数据库表结构的新建。由于老的业务系统可能是多个厂家,不同时期建设的,厂家对数据库表结构的设计也不一致。如此采用本申请实施例提供的数据处理方法可以实现所有厂家的业务系统在上线前定义数据库表结构。从而实现对数据库表结构按照数据标准库的统一要求建设的目的,基于该数据标准管理系统还可以实现统一标准的落地审核,监控落地,同时根据业务人员和标准管理员的需要可以对内或对外发布数据标准。
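Further, to help the reader understand, the embodiments of this application schematically describe the actual implementation flow of the foregoing method for processing a database table structure. First, business person A of the first business system applies for an account in the data standard management system; the data standard management system pushes the application to the system administrator for approval. If the system administrator approves it, business person A is allowed to operate on the first business system. Business person A may create a database, edit the architecture of a database, and/or delete a database in the first business system. While editing a database, business person A may create database tables by online editing or offline editing (including importing database tables or importing the information of data elements). While editing a database table, business person A may create, edit, or delete data elements. While creating or editing data elements, the data elements are adjusted based on the verification by the data standard management system; afterwards, the database tables and data elements may also be adjusted based on manual verification, for example verification of the database tables and data elements by the system administrator. After the manual verification passes, the database tables are landed against the standard to obtain the target database table structure. Finally, business person A can build the database tables in the database of the first business system based on the script corresponding to the target database table structure provided by the data standard management system. For each of these stages, refer to the explanations in the foregoing steps; details are not repeated in the embodiments of this application.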
综上所述,本申请实施例提供的业务数据处理方法,在业务系统上线前,对该业务系统的数据库表结构的信息进行校验,从而保证业务系统在上线后可以采用准确的目标数据库表结构。相较于传统技术,目标数据库表结构的可靠性较高,从而提高了业务系统上线后提供的数据的质量,减少了业务系统上线后数据转化的概率,降低了后期数据治理的成本。
本申请实施例提供一种数据标准管理系统,如图6所示,该数据标准管理系统包括:
第一获取模块301,用于获取待上线的第一业务系统的第一数据库表结构的信息,该第一数据库表结构的信息包括至少一个数据元的信息;校验模块302,用于基于数据标准库对该第一数据库表结构的信息进行校验,该数据标准库包括多个标准数据元的信息;第二获取模块303,用于在对该第一数据库表结构的信息校验后,获取该第一业务系统的目标数据库表结构,该目标数据库表结构基于校验后的该第一数据库表结构确定。
综上所述,本申请实施例提供的数据标准管理系统,在业务系统上线前,由校验模块对该业务系统的数据库表结构的信息进行校验,从而保证业务系统在上线后可以采用准确的目标数据库表结构。相较于传统技术,目标数据库表结构的可靠性较高,从而提高了业务系统上线后提供的数据的质量,减少了业务系统上线后数据转化的概率,降低了后期数据治理的成本。
可选的,该校验模块302,用于:当第一数据元的信息与该多个标准数据元的信息均不匹配时,发送第一修改提示信息,该第一修改提示信息指示更新该第一数据元的信息,该第一数据元为该至少一个数据元中的一个数据元;在接收到与该多个标准数据元中任一标准数据元的信息匹配的更新后的该第一数据元的信息后,确定对该第一数据元的信息校验成功。
可选的,该第一数据库表结构的信息和该标准数据元的信息均包括数据元标识,该第一修改提示信息包括至少一个第一标准数据元的信息,该至少一个第一标准数据元中每个第一标准数据元的数据元标识均与该第一数据元的数据元标识模糊匹配。
可选的,该第一修改提示信息包括多个该第一标准数据元的信息,多个该第一标准数据元的信息按照第一标准数据元的数据元标识与该第一数据元的数据元标识的匹配度降序排序;和/或,按照第一标准数据元所属的数据标准的优先级降序排序。
可选的,该数据标准管理系统还包括:第一接收模块,用于接收更新后的该第一数据库表结构的信息;
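a first determination module, configured to: determine the incremental data elements in the updated information of the first database table structure, and determine the information of the first data element among the information of the incremental data elements; or determine the information of the first data element among the information of the full set of data elements in the updated information of the first database table structure.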
可选的,该校验模块302,用于:当第一数据库表结构的信息的格式不符合指定格式要求时,发送第二修改提示信息,该第二修改提示信息指示更新该第一数据库表结构的信息的格式;在接收到格式符合该格式要求的更新后的该第一数据库表结构的信息后,确定对该第一数据库表结构的信息的格式校验成功。
可选的,该数据标准管理系统还包括:
备注模块,用于在该获取待上线的第一业务系统的第一数据库表结构的信息之后,当第二数据元对应的数值为可枚举数值时,为该第二数据元添加数据元备注信息,该数据元备注信息用于标识该第二数据元对应的可枚举数值,该第二数据元为该至少一个数据元中的一个数据元。
可选的,该第一数据库表结构的信息和该标准数据元的信息均包括数据元标识和数据元描述信息,该第一获取模块301,用于:接收该第一数据元的数据元标识,该第一数据元的数据元标识为该数据标准库存储的多个标准数据元的数据元标识中的一个;在该数据标准库中获取该第一数据元的数据元标识对应的数据元描述信息。
可选的,该第一获取模块301,用于:接收数据标准文档,该数据标准文档包括该第一数据库表结构的信息。
可选的,该第一获取模块301,用于:接收第三方建模工具基于该数据标准库生成的该数据标准文档。
可选的,该第一获取模块301,用于:输出数据字典模板,该数据字典模板为该第一数据库表结构的信息的参考模板;
接收基于该数据字典模板输入的该第一数据库表结构的信息。
可选的,该数据标准管理系统还包括:第二接收模块,用于接收数据标准库操作请求,该数据标准库操作请求包括标准数据元添加请求、标准数据元更新请求、标准数据元删除请求或标准数据元查询请求;操作模块,用于在对该数据标准库操作请求鉴权成功后,对该数据标准库执行该数据标准库操作请求所对应的操作。
可选的,该数据标准管理系统还包括:第三获取模块,用于获取上线后的第二业务系统的第二数据库表结构的信息,该第二数据库表结构的信息包括至少一个数据元的信息;第二确定模块,用于当基于人工智能模型检测到第三数据元的信息与该多个标准数据元的信息均不匹配时,在该多个标准数据元中确定与该第三数据元对应的目标标准数据元,该第三数据元为该第二数据库表结构的信息包括的至少一个数据元的一个数据元;建立模块,用于基于该人工智能模型,建立该第三数据元与该目标标准数据元的映射关系。
可选的,该数据标准管理系统还包括:
输出模块,用于在该获取该第一业务系统的目标数据库表结构后,输出该目标数据库表结构对应的脚本。
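It should be noted that, when the data standard management system provided in the foregoing embodiments processes a database table structure, the division into the foregoing functional modules is only an example. In actual applications, the foregoing functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.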
示例的,本申请实施例提供的数据标准管理系统的结构还可以参考前述图1和图2所示的数据标准管理系统,其中,如图7所示,数据标准管理设备102可以包括业务建模模块1021,标准审核模块1022和业务使用模块1023;生命周期管理设备103可以包括标准制定模块1031和数据标准库操作模块1032;数据治理设备104包括数据标准模块1041和元数据模块1042。
其中,业务建模模块1021可以完成前述第一获取模块301、校验模块302和第二获取模块303的功能,即执行前述步骤201至步骤202的动作;标准审核模块1022可以用于进行前述二次校验过程;业务使用模块1023用于在目标数据库表结构确定后,向项目管理员的第一终端设备发送目标数据库表结构使用请求,并在项目管理员通过第一终端设备对数据库表建设完成后,向业务人员的第二终端设备发送数据库表使用通知。
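The standard formulation module 1031 is used to establish the information of the standard data elements in the data standard library; the system administrator may formulate the information of the standard data elements offline and upload it to the data standard library. The data standard library operation module 1032 can perform the functions of the foregoing second receiving module and operation module, that is, perform the actions of steps 205 to 206.
The data standard module 1041 can perform the functions of the foregoing third acquisition module, second determination module, and establishment module, that is, perform the actions of step 207. The metadata module 1042 can set scheduled tasks to periodically collect information (such as metadata) of the database tables of the managed business systems, detect the information of database tables and data elements that the business systems have created or updated on their own, and perform data standard normativeness verification on it; for this process, refer to the corresponding process in step 202. In this way, the amount of information in business systems whose format does not meet the requirements can be reduced.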
可选地,图8示意性地提供本申请所述计算设备的一种可能的基本硬件架构。
参见图8,计算设备400包括处理器401、存储器402、通信接口403和总线404。
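In the computing device 400, there may be one or more processors 401; FIG. 8 shows only one of them. Optionally, the processor 401 may be a central processing unit (CPU). If the computing device 400 has multiple processors 401, they may be of different types or of the same type. Optionally, the multiple processors 401 of the computing device 400 may also be integrated as a multi-core processor.
The memory 402 stores computer instructions and data. It may store the computer instructions and data required to implement the data processing method provided in this application; for example, the memory 402 stores instructions for implementing the steps of the data processing method. The memory 402 may be any one or any combination of the following storage media: non-volatile memory (for example, read-only memory (ROM), solid-state drive (SSD), hard disk drive (HDD), or optical disc) and volatile memory.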
通信接口403可以是以下器件的任一种或任一种组合:网络接口(例如以太网接口)、无线网卡等具有网络接入功能的器件。
通信接口403用于计算设备400与其它计算设备或者终端进行数据通信。
总线404可以将处理器401与存储器402和通信接口403连接。这样,通过总线404,处理器401可以访问存储器402,还可以利用通信接口403与其它计算设备或者终端进行数据交互。
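In this application, the computing device 400 executes the computer instructions in the memory 402, so that the computing device 400 implements the data processing method provided in this application, or so that the data standard management system is deployed on the computing device 400.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example a memory including instructions, where the instructions can be executed by a processor of a server to complete the data processing method shown in the embodiments of this application. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.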
另外,上述实施例提供的数据标准管理系统与数据处理方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
本申请中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
在本申请中,术语“第一”和“第二”仅用于描述目的,而不能理解为指示或暗示相对重要性。术语“多个”指两个或两个以上,除非另有明确的限定。“A参考B”,指的是A与B相同,或者A在B的基础上进行简单变形。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为本申请的可选实施例,并不用以限制本申请,凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (30)

  1. 一种数据处理方法,其特征在于,所述方法包括:
    获取待上线的第一业务系统的第一数据库表结构的信息,所述第一数据库表结构的信息包括至少一个数据元的信息;
    基于数据标准库对所述第一数据库表结构的信息进行校验,所述数据标准库包括多个标准数据元的信息;
    在对所述第一数据库表结构的信息校验后,获取所述第一业务系统的目标数据库表结构,所述目标数据库表结构基于校验后的所述第一数据库表结构确定。
  2. 根据权利要求1所述的方法,其特征在于,所述基于数据标准库对所述第一数据库表结构的信息进行校验,包括:
    当第一数据元的信息与所述多个标准数据元的信息均不匹配时,发送第一修改提示信息,所述第一修改提示信息指示更新所述第一数据元的信息,所述第一数据元为所述至少一个数据元中的一个数据元;
    在接收到与所述多个标准数据元中任一标准数据元的信息匹配的更新后的所述第一数据元的信息后,确定对所述第一数据元的信息校验成功。
  3. 根据权利要求2所述的方法,其特征在于,所述第一数据库表结构的信息和所述标准数据元的信息均包括数据元标识,所述第一修改提示信息包括至少一个第一标准数据元的信息,所述至少一个第一标准数据元中每个第一标准数据元的数据元标识均与所述第一数据元的数据元标识模糊匹配。
  4. 根据权利要求3所述的方法,其特征在于,所述第一修改提示信息包括多个所述第一标准数据元的信息,多个所述第一标准数据元的信息按照第一标准数据元的数据元标识与所述第一数据元的数据元标识的匹配度降序排序;和/或,按照第一标准数据元所属的数据标准的优先级降序排序。
  5. 根据权利要求2至4任一所述的方法,其特征在于,所述方法还包括:
    接收更新后的所述第一数据库表结构的信息;
    在所述更新后的所述第一数据库表结构的信息中确定增量的数据元,并在所述增量的数据元的信息中确定所述第一数据元的信息;
    或者,在所述更新后的所述第一数据库表结构的信息的全量的数据元的信息中确定所述第一数据元的信息。
  6. 根据权利要求1至5任一所述的方法,其特征在于,所述基于数据标准库对所述第一数据库表结构的信息进行校验,包括:
    当第一数据库表结构的信息的格式不符合指定格式要求时,发送第二修改提示信息,所述第二修改提示信息指示更新所述第一数据库表结构的信息的格式;
    在接收到格式符合所述格式要求的更新后的所述第一数据库表结构的信息后,确定对所述第一数据库表结构的信息的格式校验成功。
  7. 根据权利要求1至6任一所述的方法,其特征在于,在所述获取待上线的第一业务系统的第一数据库表结构的信息之后,所述方法还包括:
    当第二数据元对应的数值为可枚举数值时,为所述第二数据元添加数据元备注信息,所述数据元备注信息用于标识所述第二数据元对应的可枚举数值,所述第二数据元为所述至少一个数据元中的一个数据元。
  8. 根据权利要求1至7任一所述的方法,其特征在于,所述第一数据库表结构的信息和所述标准数据元的信息均包括数据元标识和数据元描述信息,所述获取待上线的第一业务系统的第一数据库表结构的信息,包括:
    接收所述第一数据元的数据元标识,所述第一数据元的数据元标识为所述数据标准库存储的多个标准数据元的数据元标识中的一个;
    在所述数据标准库中获取所述第一数据元的数据元标识对应的数据元描述信息。
  9. 根据权利要求1至7任一所述的方法,其特征在于,所述获取待上线的第一业务系统的第一数据库表结构的信息,包括:
    接收数据标准文档,所述数据标准文档包括所述第一数据库表结构的信息。
  10. 根据权利要求9所述的方法,其特征在于,所述接收数据标准文档,包括:
    接收第三方建模工具基于所述数据标准库生成的所述数据标准文档。
  11. 根据权利要求8或9所述的方法,其特征在于,所述获取待上线的第一业务系统的第一数据库表结构的信息,包括:
    输出数据字典模板,所述数据字典模板为所述第一数据库表结构的信息的参考模板;
    接收基于所述数据字典模板输入的所述第一数据库表结构的信息。
  12. 根据权利要求1至10任一所述的方法,其特征在于,所述方法还包括:
    接收数据标准库操作请求,所述数据标准库操作请求包括标准数据元添加请求、标准数据元更新请求、标准数据元删除请求或标准数据元查询请求;
    在对所述数据标准库操作请求鉴权成功后,对所述数据标准库执行所述数据标准库操作请求所对应的操作。
  13. 根据权利要求1至12任一所述的方法,其特征在于,所述方法还包括:
    获取上线后的第二业务系统的第二数据库表结构的信息,所述第二数据库表结构的信息包括至少一个数据元的信息;
    当基于人工智能模型检测到第三数据元的信息与所述多个标准数据元的信息均不匹配时,在所述多个标准数据元中确定与所述第三数据元对应的目标标准数据元,所述第三数据元为所述第二数据库表结构的信息包括的至少一个数据元的一个数据元;
    基于所述人工智能模型,建立所述第三数据元与所述目标标准数据元的映射关系。
  14. 根据权利要求1至13任一所述的方法,其特征在于,在所述获取所述第一业务系统的目标数据库表结构后,所述方法还包括:
    输出所述目标数据库表结构对应的脚本。
  15. 一种数据标准管理系统,其特征在于,所述系统包括:
    第一获取模块,用于获取待上线的第一业务系统的第一数据库表结构的信息,所述第一数据库表结构的信息包括至少一个数据元的信息;
    校验模块,用于基于数据标准库对所述第一数据库表结构的信息进行校验,所述数据标准库包括多个标准数据元的信息;
    第二获取模块,用于在对所述第一数据库表结构的信息校验后,获取所述第一业务系统的目标数据库表结构,所述目标数据库表结构基于校验后的所述第一数据库表结构确定。
  16. 根据权利要求15所述的系统,其特征在于,所述校验模块,用于:
    当第一数据元的信息与所述多个标准数据元的信息均不匹配时,发送第一修改提示信息,所述第一修改提示信息指示更新所述第一数据元的信息,所述第一数据元为所述至少一个数据元中的一个数据元;
    在接收到与所述多个标准数据元中任一标准数据元的信息匹配的更新后的所述第一数据元的信息后,确定对所述第一数据元的信息校验成功。
  17. 根据权利要求16所述的系统,其特征在于,所述第一数据库表结构的信息和所述标准数据元的信息均包括数据元标识,所述第一修改提示信息包括至少一个第一标准数据元的信息,所述至少一个第一标准数据元中每个第一标准数据元的数据元标识均与所述第一数据元的数据元标识模糊匹配。
  18. 根据权利要求17所述的系统,其特征在于,所述第一修改提示信息包括多个所述第一标准数据元的信息,多个所述第一标准数据元的信息按照第一标准数据元的数据元标识与所述第一数据元的数据元标识的匹配度降序排序;和/或,按照第一标准数据元所属的数据标准的优先级降序排序。
  19. 根据权利要求16至18任一所述的系统,其特征在于,所述系统还包括:
    第一接收模块,用于接收更新后的所述第一数据库表结构的信息;
    第一确定模块,用于:
    在所述更新后的所述第一数据库表结构的信息中确定增量的数据元,并在所述增量的数据元的信息中确定所述第一数据元的信息;
    或者,在所述更新后的所述第一数据库表结构的信息的全量的数据元的信息中确定所述第一数据元的信息。
  20. 根据权利要求15至19任一所述的系统,其特征在于,所述校验模块,用于:
    当第一数据库表结构的信息的格式不符合指定格式要求时,发送第二修改提示信息,所述第二修改提示信息指示更新所述第一数据库表结构的信息的格式;
    在接收到格式符合所述格式要求的更新后的所述第一数据库表结构的信息后,确定对所述第一数据库表结构的信息的格式校验成功。
  21. 根据权利要求15至20任一所述的系统,其特征在于,所述系统还包括:
    备注模块,用于在所述获取待上线的第一业务系统的第一数据库表结构的信息之后,当第二数据元对应的数值为可枚举数值时,为所述第二数据元添加数据元备注信息,所述数据元备注信息用于标识所述第二数据元对应的可枚举数值,所述第二数据元为所述至少一个数据元中的一个数据元。
  22. 根据权利要求15至21任一所述的系统,其特征在于,所述第一数据库表结构的信息和所述标准数据元的信息均包括数据元标识和数据元描述信息,所述第一获取模块,用于:
    接收所述第一数据元的数据元标识,所述第一数据元的数据元标识为所述数据标准库存储的多个标准数据元的数据元标识中的一个;
    在所述数据标准库中获取所述第一数据元的数据元标识对应的数据元描述信息。
  23. 根据权利要求15至22任一所述的系统,其特征在于,所述第一获取模块,用于:
    接收数据标准文档,所述数据标准文档包括所述第一数据库表结构的信息。
  24. 根据权利要求23所述的系统,其特征在于,所述第一获取模块,用于:
    接收第三方建模工具基于所述数据标准库生成的所述数据标准文档。
  25. 根据权利要求23或24所述的系统,其特征在于,所述第一获取模块,用于:
    输出数据字典模板,所述数据字典模板为所述第一数据库表结构的信息的参考模板;
    接收基于所述数据字典模板输入的所述第一数据库表结构的信息。
  26. 根据权利要求15至25任一所述的系统,其特征在于,所述系统还包括:
    第二接收模块,用于接收数据标准库操作请求,所述数据标准库操作请求包括标准数据元添加请求、标准数据元更新请求、标准数据元删除请求或标准数据元查询请求;
    操作模块,用于在对所述数据标准库操作请求鉴权成功后,对所述数据标准库执行所述数据标准库操作请求所对应的操作。
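  27. The system according to any one of claims 15 to 26, wherein the system further comprises:
    a third acquisition module, configured to acquire information of a second database table structure of a second business system that has gone online, wherein the information of the second database table structure comprises information of at least one data element;
    a second determination module, configured to: when it is detected, based on an artificial intelligence model, that information of a third data element matches none of the information of the multiple standard data elements, determine, among the multiple standard data elements, a target standard data element corresponding to the third data element, wherein the third data element is one of the at least one data element comprised in the information of the second database table structure;
    an establishment module, configured to establish, based on the artificial intelligence model, a mapping relationship between the third data element and the target standard data element.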
  28. 根据权利要求15至27任一所述的系统,其特征在于,所述系统还包括:
    输出模块,用于在所述获取所述第一业务系统的目标数据库表结构后,输出所述目标数据库表结构对应的脚本。
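  29. A computer device, comprising:
    a processor and a memory;
    the memory is configured to store computer instructions;
    the processor is configured to execute the computer instructions stored in the memory, so that the computer device performs the data processing method according to any one of claims 1 to 14.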
  30. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质包括计算机指令,所述计算机指令指示计算设备执行权利要求1至14任一所述的数据处理方法。
PCT/CN2021/075477 2020-03-19 2021-02-05 数据处理方法及数据标准管理系统 WO2021184995A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010197689.7A CN113495902A (zh) 2020-03-19 2020-03-19 数据处理方法及数据标准管理系统
CN202010197689.7 2020-03-19

Publications (1)

Publication Number Publication Date
WO2021184995A1 true WO2021184995A1 (zh) 2021-09-23

Family

ID=77767982

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/075477 WO2021184995A1 (zh) 2020-03-19 2021-02-05 数据处理方法及数据标准管理系统

Country Status (2)

Country Link
CN (1) CN113495902A (zh)
WO (1) WO2021184995A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114638729A (zh) * 2022-05-18 2022-06-17 国网浙江省电力有限公司 基于能源互联网营销服务的双中台架构的电力稽查方法
CN115982137A (zh) * 2023-03-17 2023-04-18 鲁班(北京)电子商务科技有限公司 一种数据名称和数据库建表生成方法及系统
CN117235077A (zh) * 2023-11-15 2023-12-15 青岛民航凯亚系统集成有限公司 一种基于数据编织的机场智能化数据治理方法及系统

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115543977A (zh) * 2022-09-29 2022-12-30 河北雄安睿天科技有限公司 一种供水行业数据清洗方法
CN117389996B (zh) * 2023-12-11 2024-03-29 深圳万物安全科技有限公司 数据库优化建议生成方法、终端设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106528828A (zh) * 2016-11-22 2017-03-22 山东浪潮云服务信息科技有限公司 一种基于多维度校验规则的数据质量检测方法
CN107844588A (zh) * 2017-11-17 2018-03-27 中国银行股份有限公司 一种数据字典的处理方法、装置、存储介质及处理器
US20190205294A1 (en) * 2016-09-30 2019-07-04 Microsoft Technology Licensing, Llc Reducing processing for comparing large metadata sets
CN110008193A (zh) * 2019-04-16 2019-07-12 成都四方伟业软件股份有限公司 数据标准化方法及装置
CN110389941A (zh) * 2019-06-19 2019-10-29 平安国际智慧城市科技股份有限公司 数据库校验方法、装置、设备及存储介质

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114638729A (zh) * 2022-05-18 2022-06-17 国网浙江省电力有限公司 基于能源互联网营销服务的双中台架构的电力稽查方法
CN114638729B (zh) * 2022-05-18 2022-08-02 国网浙江省电力有限公司 基于能源互联网营销服务的双中台架构的电力稽查方法
CN115982137A (zh) * 2023-03-17 2023-04-18 鲁班(北京)电子商务科技有限公司 一种数据名称和数据库建表生成方法及系统
CN117235077A (zh) * 2023-11-15 2023-12-15 青岛民航凯亚系统集成有限公司 一种基于数据编织的机场智能化数据治理方法及系统
CN117235077B (zh) * 2023-11-15 2024-03-08 青岛民航凯亚系统集成有限公司 一种基于数据编织的机场智能化数据治理方法及系统

Also Published As

Publication number Publication date
CN113495902A (zh) 2021-10-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21771085

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21771085

Country of ref document: EP

Kind code of ref document: A1