CN116501375B - Data dictionary version management method, device, computer equipment and storage medium - Google Patents

Data dictionary version management method, device, computer equipment and storage medium

Info

Publication number
CN116501375B
Authority
CN
China
Prior art keywords
data
data dictionary
dictionary
merging
software service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310737245.1A
Other languages
Chinese (zh)
Other versions
CN116501375A (en)
Inventor
冯斌
邱龙根
朱家祺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Fulin Technology Co Ltd
Original Assignee
Shenzhen Fulin Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Fulin Technology Co Ltd
Priority to CN202310737245.1A
Publication of CN116501375A
Application granted
Publication of CN116501375B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/71 Version control; Configuration management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/219 Managing data history or versioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations
    • G06F16/24556 Aggregation; Duplicate elimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)

Abstract

The invention relates to the field of data management, and discloses a data dictionary version management method, a device, computer equipment and a storage medium, wherein the method comprises the following steps: acquiring a description interface of a software service platform, and extracting a first data dictionary of data fields of the software service platform by accessing the description interface; acquiring recent historical data of the software service platform, and identifying the historical data to obtain a second data dictionary of data fields of the software service platform; combining the data of the first data dictionary and the data of the second data dictionary to obtain a first combined data dictionary; and acquiring the latest intermediate data dictionary of the software service platform, combining the first combined data dictionary with the latest intermediate data dictionary to obtain a second combined data dictionary, and applying the second combined data dictionary as the latest data dictionary to the software service platform. Therefore, the unification of the data dictionary in the software service platform is ensured through multiple merging operations.

Description

Data dictionary version management method, device, computer equipment and storage medium
Technical Field
The present invention relates to the field of data management, and in particular, to a method and apparatus for managing a version of a data dictionary, a computer device, and a storage medium.
Background
SaaS software platform services are a mainstream choice for small and medium-sized enterprises that need to meet business demands without investing excessive technical resources. Most SaaS platforms provide an open interface platform so that the enterprises using them can flexibly access, operate on and compute statistics over their data records. Because a SaaS platform must offer self-service features to enterprises in different lines of business, it provides flexible and variable data view definitions, and gives up certain schema associations and capabilities for the sake of general applicability. This causes two problems. (1) Because of the self-service nature, definitions are easily changed; in particular, changes to enumeration codes and business display values can negatively affect data warehouses and reports. (2) Within the same field, such as marketing or human resources, two or more SaaS platforms may be used successively or simultaneously, and dictionary changes on these platforms can make the data platform inconsistent and unstable.
Disclosure of Invention
The application provides a data dictionary version management method, a data dictionary version management device, computer equipment and a storage medium, so as to solve the problem of instability caused by non-unification of a data dictionary.
In a first aspect, the present application provides a method for managing a version of a data dictionary, including:
acquiring a description interface of a software service platform, and extracting a first data dictionary of data fields of the software service platform by accessing the description interface;
acquiring recent historical data of the software service platform, and identifying the historical data to obtain a second data dictionary of data fields of the software service platform;
combining the data of the first data dictionary and the second data dictionary to obtain a first combined data dictionary;
acquiring a latest intermediate data dictionary of the software service platform, and combining the first combined data dictionary and the latest intermediate data dictionary to obtain a second combined data dictionary;
and applying the second combined data dictionary as the latest data dictionary to the software service platform.
Further, if there are multiple software service platforms, after obtaining the second merged data dictionary, the method further includes:
acquiring the second merged data dictionaries of the different software service platforms, and merging the second merged data dictionaries according to rule checking to obtain a third merged data dictionary;
and merging the latest historical data dictionary with the third merged data dictionary to obtain a merged data dictionary, and applying the merged data dictionary as the latest data dictionary to the software service platforms.
Further, the merging process includes:
determining the same data field in the first data source and the second data source which participate in merging;
comparing the same data fields in the first data source and the second data source, and determining the difference types among the same data fields;
and determining a corresponding merging method according to the difference types so as to merge the first data source and the second data source.
Further, the determining a corresponding merging method according to the difference type to merge the first data source and the second data source includes:
if the stored values differ within the same data field, merging is performed based on the format of the value in the first data source;
if the stored value formats conflict within the same data field, the format of the value in the second data source is joined to the format of the value in the first data source so as to extend it;
if the enumerated items, the code values and the label values all differ within the same data field, the values in the data field are merged directly;
if the enumerated items differ within the same data field but the code values and label values partly coincide, associations forming a directed acyclic graph are constructed through association matching and fuzzy matching, and a unique data warehouse code is formed.
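As a rough illustration of how the format case and the two enumeration cases above might be handled, the following Python sketch may help. It is not taken from the application: the function names `merge_formats` and `merge_enums`, the `{code: label}` layout of an enumeration, and the similarity threshold are all assumptions, and the association graph is reduced to a plain mapping from a code in the second data source to the matching code in the first.

```python
from difflib import SequenceMatcher


def merge_formats(first_formats, second_formats):
    """Format conflict: keep the first source's formats, append new ones from the second."""
    return first_formats + [f for f in second_formats if f not in first_formats]


def merge_enums(first, second, threshold=0.8):
    """Merge two enumerations given as {code: label} maps.

    Returns (merged, links). `links` maps a code of `second` to the code of
    `first` whose label it matched; it stands in for the association graph.
    """
    merged = dict(first)
    links = {}
    for code, label in second.items():
        if merged.get(code) == label:
            continue  # identical entry, nothing to merge
        # Hypothetical fuzzy match: case-insensitive label similarity.
        match = next(
            (c for c, known in first.items()
             if SequenceMatcher(None, known.lower(), label.lower()).ratio() >= threshold),
            None,
        )
        if match is not None:
            if match != code:
                links[code] = match  # same meaning under a different code
        elif code not in merged:
            merged[code] = label     # entirely new item: merge directly
        else:
            raise ValueError(f"code {code!r} carries conflicting labels")
    return merged, links
```

In this sketch the first data source always wins when both sources describe the same item, which mirrors the rule that merging is based on the first data source; a real implementation would also need to keep the links acyclic and derive the warehouse code from them.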
Further, the method further comprises:
if the software service platform does not have the description interface, only acquiring the second data dictionary, and taking the second data dictionary as the first combined data dictionary;
and merging the latest intermediate data dictionary of the software service platform with the first merged data dictionary to obtain the second merged data dictionary.
Further, the method further comprises:
if a conflict that cannot be merged is found, raising an exception alarm, stopping the merge, and taking the latest historical data dictionary as the latest version of the data dictionary;
and if no conflict is found, taking the merged data dictionary as the latest version of the data dictionary.
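The fallback behaviour described in the two clauses above can be sketched as follows. This is a minimal illustration under assumptions: `MergeConflict`, `next_version`, `strict_merge` and the `alert` callback are hypothetical names, and a dictionary is modelled as a flat `{field: description}` map.

```python
class MergeConflict(Exception):
    """Raised by a merge function when two dictionaries cannot be reconciled."""


def strict_merge(first, second):
    """Example merge: union of fields, refusing any field described differently."""
    for field in first.keys() & second.keys():
        if first[field] != second[field]:
            raise MergeConflict(f"field {field!r} differs")
    return {**first, **second}


def next_version(latest_history, candidate, merge, alert=print):
    """Return the dictionary to publish as the latest version."""
    try:
        return merge(latest_history, candidate)
    except MergeConflict as exc:
        # Unresolvable conflict: raise an alarm and keep the previous version.
        alert(f"data dictionary merge aborted: {exc}")
        return latest_history
```

Passing the alarm as a callback keeps the sketch independent of any particular alerting system.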
Further, after extracting the first data dictionary of the data field of the software service platform through the description interface, the method further includes:
and extracting field names and corresponding values in the first data dictionary, and converting the field names and the corresponding values into a file in a JSON format.
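A minimal sketch of this conversion, assuming the extracted dictionary is already available as a mapping from field name to example value, could look as follows. The function name `dictionary_to_json` and the `fields`/`name`/`value` layout of the output are illustrative choices, not part of the application.

```python
import json


def dictionary_to_json(fields):
    """Serialize {field name: value} pairs as a JSON document."""
    entries = [{"name": name, "value": value} for name, value in fields.items()]
    # ensure_ascii=False keeps non-ASCII labels readable in the output file.
    return json.dumps({"fields": entries}, ensure_ascii=False, indent=2)
```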
In a second aspect, the present application further provides a data dictionary version management apparatus, including:
the first internal processing module is used for acquiring a description interface of the software service platform and extracting a first data dictionary of data fields of the software service platform by accessing the description interface;
the second internal processing module is used for acquiring the latest historical data of the software service platform, and identifying the historical data to obtain a second data dictionary of the data field of the software service platform;
the first merging module is used for merging the data of the first data dictionary and the second data dictionary to obtain a first merged data dictionary;
and the second merging module is used for acquiring the latest intermediate data dictionary of the software service platform, merging the first merging data dictionary with the latest intermediate data dictionary to obtain a second merging data dictionary, and applying the second merging data dictionary as the latest data dictionary to the software service platform.
In a third aspect, the present application also provides a computer device comprising a processor and a memory, the memory storing a computer program which, when run on the processor, performs the data dictionary version management method.
In a fourth aspect, the present application also provides a readable storage medium storing a computer program which, when run on a processor, performs the data dictionary version management method.
Compared with the prior art, the application has the following main beneficial effects:
The invention relates to the field of data management and discloses a data dictionary version management method, a device, computer equipment and a storage medium. The method comprises: acquiring a description interface of a software service platform, and extracting a first data dictionary of each field of each object of the service platform by accessing the description interface; acquiring the latest historical data of the software service platform, and identifying the historical data to obtain a second data dictionary of the historical data; combining the data of the first data dictionary and the second data dictionary according to a combination rule to obtain a first combined data dictionary; and acquiring the latest intermediate data dictionary of the software service platform, and combining the first combined data dictionary and the latest intermediate data dictionary according to the combination rule to obtain a second combined data dictionary. The multiple merging operations thus keep the data dictionary in the software service platform unified, so that business users can adjust fields promptly through the self-service function without waiting for data development staff to handle the change, which improves operating efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the present invention, the drawings that are required for the embodiments will be briefly described, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope of the present invention. Like elements are numbered alike in the various figures.
FIG. 1 is a schematic flow chart of a data dictionary version management method according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating another method for managing version of a data dictionary according to an embodiment of the present application;
FIG. 3 illustrates a data dictionary management flow diagram for multiple platforms in accordance with embodiments of the present application;
FIG. 4 is a schematic diagram of a data dictionary merging process according to an embodiment of the present application;
FIG. 5 shows a schematic diagram of a data dictionary version management apparatus according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments.
The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention.
The terms "comprises," "comprising," "including," or any other variation thereof, as used in the various embodiments of the present invention, are intended to indicate a specific feature, number, step, operation, element, component, or combination of the foregoing, and do not exclude the presence or possible addition of one or more other features, numbers, steps, operations, elements, components, or combinations of the foregoing.
Furthermore, the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and should not be construed as indicating or implying relative importance.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which various embodiments of the invention belong. The terms (such as those defined in commonly used dictionaries) will be interpreted as having a meaning that is the same as the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein in connection with the various embodiments of the invention.
A data dictionary defines and describes the data items, data structures, data flows, data stores, processing logic and so on of the data. It describes each element of a data flow diagram in detail and is used for simple modeling items. In short, a data dictionary is a collection of information describing data, that is, a collection of definitions for all data elements used in a system.
The technical scheme is applied to maintaining the data dictionary of software service platforms in a business domain, and in particular to merging the definitions of data fields that are shared by different service platforms. First, the data description of a software service platform is acquired to obtain the data dictionary of the current software service platform; this data dictionary is then compared and merged with the latest historical data dictionary to obtain the latest merged data dictionary. If several different software service platforms exist in the business domain, the data dictionaries of the different platforms are merged first and the result is then merged with the latest historical data dictionary of the business domain, giving the merged data dictionary of the business domain. All software service platforms in the business domain can then use this data dictionary, which ensures that the data fields in the business domain are unified and free of conflicts.
The technical scheme of the application is described in the following specific embodiments.
Example 1
The software service platform is a platform that enables users to connect and use cloud-based applications via the internet. Common examples are email, calendar and office tools. It does not require the user to install the software product on his own computer or server. In corporate enterprises, the use of software service platforms can greatly save costs and optimize internal flows.
However, different businesses each have their own internal operating logic, and departments are further subdivided and differ in the kinds of internal and external business interactions they handle. As a result, several software service platforms may exist within the same business domain or department, or the platform may be replaced several times, so that the data dictionaries do not match.
To this end, as shown in fig. 1, the present application provides a data dictionary version management method including:
step S100, a description interface of a software service platform is obtained, and a first data dictionary of data fields of the software service platform is extracted by accessing the description interface.
The description interface of the software service platform lets an external caller obtain the platform's data dictionary; each data field and the value corresponding to it can be obtained directly through the description interface.
It will be appreciated that the data dictionary obtained directly from the description interface is a long unformatted character string. To make the data dictionary readable and processable, the obtained data also needs to be formatted, for example by processing the character string into a JSON file.
Specifically, the following table shows:
In the table, the field names are the actual field names in the code, and the field examples are possible forms of the corresponding values under those field names.
All keywords shared by the first N rows of data can be extracted and combined into a field list, which is converted according to the format of the originally accessed data dictionary. All values of each field (at most N) are then aggregated and judged automatically by a rule engine; the automatic judgment includes adapting to common date formats, identifying amounts and their precision, and matching known enumeration item-value pairs. This step tolerates incomplete information, which can be supplemented from other reference data and through gradual merging. Here N is the number of rows sharing the same keywords. In this way, data in JSON or a similar format can be formed, and the first data dictionary is obtained.
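The sampling step above can be sketched in Python as follows. This is only an illustration under stated assumptions: rows are modelled as dictionaries of string values, the candidate date formats, the amount pattern, and the names `infer_field` and `infer_dictionary` are hypothetical, and the "rule engine" is reduced to three ordered checks (known enumeration codes, date formats, amounts with precision).

```python
import re
from datetime import datetime

DATE_FORMATS = ["%Y-%m-%d", "%Y-%m-%d %H:%M:%S", "%Y/%m/%d"]  # assumed candidates
AMOUNT = re.compile(r"-?\d+(\.\d+)?")


def _parses(value, fmt):
    try:
        datetime.strptime(value, fmt)
        return True
    except (ValueError, TypeError):
        return False


def infer_field(values, known_codes=()):
    """Guess a description of one field from its sampled string values."""
    sample = [v for v in values if v not in (None, "")]
    if not sample:
        return {"type": "unknown"}  # incomplete information is tolerated
    if known_codes and set(sample) <= set(known_codes):
        return {"type": "enum", "codes": sorted(set(sample))}
    for fmt in DATE_FORMATS:
        if all(_parses(v, fmt) for v in sample):
            return {"type": "date", "format": fmt}
    if all(AMOUNT.fullmatch(v) for v in sample):
        precision = max(len(v.partition(".")[2]) for v in sample)
        return {"type": "amount", "precision": precision}
    return {"type": "string"}


def infer_dictionary(rows, n=100, known_codes=None):
    """Describe every key shared by the first n rows."""
    known_codes = known_codes or {}
    head = rows[:n]
    shared = set(head[0]).intersection(*head[1:]) if head else set()
    return {
        field: infer_field([row[field] for row in head], known_codes.get(field, ()))
        for field in sorted(shared)
    }
```

Checking known enumeration codes before amounts is a design choice of the sketch: numeric-looking codes such as "01" would otherwise be classified as amounts.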
Step S200, the latest historical data of the software service platform is obtained, and the historical data is identified to obtain a second data dictionary of the data fields of the software service platform.
The historical data refers to the actual data generated by the software service platform while performing business operations. Because this data is produced in the actual working process, and the working environment of the staff differs from that of the developers, the values in some data fields may take a different form from the one defined in the internal code. Take the definition of time as an example: the time defined in the code consists of year, month and day, whereas the time recorded in actual work consists of year, month, day, hour, minute and second, that is, three more components (hour, minute and second) than the internal definition, yet both belong to the "time" data field. Such differences may arise because the data source has changed or because a developer made a coding error. Since these problems can exist within a single platform, the first data dictionary is obtained by performing step S100, while the historical data is obtained directly and summarized, and the second data dictionary is then derived from the actual data.
It is to be understood that, when the second data dictionary is acquired, the formatting process as described in step S100 may also be performed to convert the acquired history data into the second data dictionary in JSON format.
In addition, the first data dictionary and the second data dictionary may not be data dictionaries of all data fields in the platform, but may be data dictionaries of a certain structural object, and the data dictionaries are acquired in batches for different data objects, so that the acquisition of all data dictionaries is completed step by step.
And step S300, merging the data of the first data dictionary and the second data dictionary to obtain a first merged data dictionary.
After the first data dictionary and the second data dictionary are obtained, the two data dictionaries can be combined.
Specifically, the merging operation in this embodiment is a correction operation: the portions in which the two data dictionaries differ are modified so that they become the same, yielding a unified data dictionary. To this end, the first data dictionary and the second data dictionary are merged according to the merging rule to obtain the first merged data dictionary.
It should be noted that some software service platforms may not have a description interface. For such a platform, only the operation of step S200 is performed and the merging operation of this step is skipped; the second data dictionary is used directly as the first merged data dictionary.
Step S400, obtaining the latest intermediate data dictionary of the software service platform, combining the first combined data dictionary and the latest intermediate data dictionary to obtain a second combined data dictionary, and applying the second combined data dictionary as the latest data dictionary to the software service platform.
The latest intermediate data dictionary refers to the data dictionary obtained after the last combination belonging to the software service platform.
Specifically, the first merged data dictionary reflects the current point in time: it is the data dictionary obtained by collecting and merging the platform's data at the present moment. The latest intermediate data dictionary is the second merged data dictionary obtained the last time the software service platform performed the merging operation of this step. By merging the current first merged data dictionary with the latest intermediate data dictionary, the content of the first merged data dictionary is checked and the latest intermediate data dictionary is updated.
And step S500, the second combined data dictionary is used as the latest data dictionary to be applied to the software service platform.
It can be understood that the first merged data dictionary represents the current, most recent data dictionary, and the second merged data dictionary is obtained by checking it against and merging it with the latest intermediate data dictionary. Because data dictionaries from two different points in time are merged, the data dictionary stays both current and correct, and the data inside the platform is kept in line with the latest data dictionary in real time. The second merged data dictionary thus obtained can be applied directly to the software service platform as the latest data dictionary.
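Steps S100 to S500 for a single platform can be summarized in a short sketch. The function `run_once` and its parameters are hypothetical; the merge rule is passed in as a callable because the application describes it separately, and the absence of a description interface is modelled by passing `None`.

```python
def run_once(describe, history, intermediate, merge):
    """One pass of steps S100 to S500 for a single platform.

    describe:     callable returning the first data dictionary, or None when
                  the platform has no description interface.
    history:      callable returning the second data dictionary.
    intermediate: the second merged data dictionary of the previous run.
    merge:        callable(first_source, second_source) -> merged dictionary.
    """
    first = describe() if describe is not None else None               # S100
    second = history()                                                 # S200
    first_merged = second if first is None else merge(first, second)   # S300
    second_merged = merge(first_merged, intermediate)                  # S400
    return second_merged  # S500: the caller applies it to the platform
```

The return value of one run would be stored and passed as `intermediate` to the next run, which is how the sketch models the "latest intermediate data dictionary".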
Further, for a department or a business domain, there may be a plurality of software service platforms, so for the case of a plurality of software service platforms, as shown in fig. 2, the data dictionary management method of the present embodiment further includes:
Step S600, obtaining the second merged data dictionaries of the different software service platforms, and merging the second merged data dictionaries according to rule checking to obtain a third merged data dictionary.
For convenience of explanation, this embodiment will be described by taking two software service platforms as examples.
As shown in fig. 3, platform 1 and platform 2 represent two software service platforms in the same department that contain the same data fields. Because the platforms serve the same department, their internal data dictionaries need to be the same so that interaction within the department does not run into problems.
Platform 1 and platform 2 can each obtain a second merged data dictionary through the operations of steps S100 to S400. The merging rule is similar to that of step S300, except that here the second merged data dictionaries of the two platforms are merged, finally forming a unified third merged data dictionary.
And step S700, merging the latest historical data dictionary and the third merged data dictionary to obtain a merged data dictionary, and applying the merged data dictionary as the latest data dictionary to the software service platform.
It will be understood that the business department as a whole also has a latest historical data dictionary, which can be understood as the final merged result of the previous data dictionary merging operation and which both platforms currently use in common. Merging the latest historical data dictionary with the third merged data dictionary obtained in step S600 updates the latest historical data dictionary: the updated parts and content are taken from the third merged data dictionary, which avoids differences in data content between platform 1 and platform 2.
For example, suppose that in the actual production environment a data field is changed in platform 1 because of a changed business requirement, while the corresponding data field in platform 2 is not changed. The merging operation in step S600 then carries the change into the third merged data dictionary, and when the third merged data dictionary is merged with the latest historical data dictionary, the change is carried into the latest historical data dictionary as well, completing the update of the data dictionary. The corresponding data field in platform 2 is then changed in the same way, so that the data dictionaries of the two platforms remain consistent.
It will be appreciated that the method of this embodiment may be used on two platforms or on three or more platforms; with three or more platforms, the order in which the platforms are merged may need to be set.
For example, if there are four platforms A, B, C and D and the merging order is set from A to D, the second merged data dictionaries of A and B are merged first to obtain the third merged data dictionary of platforms A and B. This is then merged with the second merged data dictionary of platform C to obtain the third merged data dictionary of platforms A, B and C, and so on, until the third merged data dictionary covering all four platforms is obtained. Finally, this third merged data dictionary is merged with the latest historical data dictionary to obtain the merged data dictionary of the four platforms.
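The ordered merge over platforms A to D amounts to a left fold over the platforms' second merged data dictionaries, followed by one final merge with the latest historical data dictionary. A minimal sketch, with the hypothetical name `merge_platforms` and with the argument order of the final merge chosen arbitrarily, is:

```python
from functools import reduce


def merge_platforms(second_merged, order, latest_history, merge):
    """Fold the platforms' dictionaries in the given order, then merge with history.

    second_merged:  {platform name: second merged data dictionary}
    order:          platform names in the configured merging order
    latest_history: the latest historical data dictionary of the department
    merge:          callable(first_source, second_source) -> merged dictionary
    """
    # ((A + B) + C) + D: each intermediate result is a "third merged" dictionary.
    third = reduce(merge, (second_merged[name] for name in order))
    return merge(latest_history, third)
```

Because each merge treats its first argument as the basis, the configured order decides which platform's definition prevails when two platforms disagree, which is why the order may need to be set explicitly.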
It will be appreciated that the data dictionary version management method of the present embodiment may be periodically executed, for example, once a day, once a week, once a month, etc., to ensure that the data dictionary between the software service platforms within the same department may be uniform through a periodic merging operation. Meanwhile, the method can be applied to different departments, and it can be understood that for an enterprise, the internal system has a lot of data which need to be processed across departments, for example, a performance group for calculating performance needs to acquire related data of performance management in other departments, and the data may be different in different departments.
Therefore, the data dictionary version management method of this embodiment can be applied to one or more software service platforms. Used on a single platform, it avoids discrepancies between the internal data description and the actual interaction data caused by errors in the data source or by developers, and it allows the data dictionary to be maintained automatically and updated in real time. Used on several software service platforms, it merges and integrates their data dictionaries and finally generates a complete merged data dictionary that can be applied to all of them, so that the data dictionaries of the platforms are the same. Enterprises can thus freely switch between software service platforms in the same field, or use several of them at once, without worrying unduly about the risks to data integration and reports. This ensures the accuracy and stability of data reports and data applications and improves the maintenance efficiency of the data warehouse and reports.
Furthermore, the technical solution of this embodiment performs multiple merging operations. As the above steps show, the data involved in merging may be data dictionaries from different software service platforms or data dictionaries of different versions from different time periods. A unified merging process is therefore required to ensure that the merged data is free of errors and that some conflicts can be resolved automatically.
Preferably, this embodiment further provides a merging process which, as shown in Fig. 4, includes:
Step S800, determining the same data fields in the first data source and the second data source participating in the merging.
The first data source and the second data source are the data sources participating in a merge, for example the first data dictionary and the second data dictionary of embodiment 1, or the first merged data dictionary and the latest intermediate data dictionary. That is, the merging process of this embodiment may be applied to any of the merging operations in steps S100 to S600 above.
For convenience of the following description, in this embodiment the data in the first data source is used as the basis for merging, and all merging operations are carried out on the first data source.
Merging requires that the description and the value of each shared data field end up identical, so all merging operations are performed on the shared data fields. To this end, the same data fields in the first data source and the second data source must first be matched with one another; they are then compared, and the subsequent merging operations are performed.
Step S900, comparing the same data fields in the first data source and the second data source, and determining the difference types between the same data fields.
The values or descriptions in the same data fields may differ for various reasons, and different kinds of difference must be handled in different ways. These data fields therefore need to be compared to obtain the corresponding difference type.
Specifically, the data in these data fields may be format-matched and checked for whether the descriptions are the same, and the difference type is determined from the matching result. For example, if a data field has a value in the first data source and the corresponding data field in the second data source has no value, the difference type may be classified as NULL and represented by a NULL marker. If a format conflict is found, the field may be marked with a conflict marker. If the fields are completely identical, no mark is applied. In this way the comparison and difference classification of the same data fields in the first and second data sources is completed.
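The comparison step can be illustrated with a small classifier. This is a sketch under stated assumptions: `FieldEntry`, `DiffType`, `classify_field`, and `classify_sources` are hypothetical names, each field is reduced to a format label plus an optional value, and the order in which the checks are applied is a design choice of this example rather than part of the embodiment.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class DiffType(Enum):
    SAME = "same"                        # identical, no mark needed
    NULL = "null"                        # one side has no value
    FORMAT_CONFLICT = "format_conflict"  # value formats disagree
    VALUE = "value"                      # same format, different value


@dataclass(frozen=True)
class FieldEntry:
    fmt: str               # hypothetical format label, e.g. "date"
    value: Optional[str]   # None models a field without a value


def classify_field(first: FieldEntry, second: FieldEntry) -> DiffType:
    # A value present on exactly one side is classified as NULL.
    if (first.value is None) != (second.value is None):
        return DiffType.NULL
    if first.fmt != second.fmt:
        return DiffType.FORMAT_CONFLICT
    if first.value != second.value:
        return DiffType.VALUE
    return DiffType.SAME


def classify_sources(
    first: Dict[str, FieldEntry], second: Dict[str, FieldEntry]
) -> Dict[str, DiffType]:
    # Only fields present in both sources take part in the comparison.
    shared = first.keys() & second.keys()
    return {name: classify_field(first[name], second[name]) for name in sorted(shared)}


first_source = {
    "created": FieldEntry("date", "2023-06-21"),
    "status": FieldEntry("code", "1"),
    "owner": FieldEntry("string", "alice"),
    "only_in_first": FieldEntry("string", "x"),
}
second_source = {
    "created": FieldEntry("datetime", "2023-06-21 00:00:00"),
    "status": FieldEntry("code", "2"),
    "owner": FieldEntry("string", None),
}
differences = classify_sources(first_source, second_source)
```

The resulting mapping from field name to difference type is what a later step could use to select the merging method for each field.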
Step S1000, determining a corresponding merging method according to the difference type to merge the first data source and the second data source.
If the stored values are different in the same data field, merging is performed based on the format of the value of the first data source.
It can be understood that merging unifies the values of the same data field in two different data dictionaries. If only the values differ, either data source could serve as the standard; in this embodiment, the value in the first data source is taken as the standard.
If the formats of the values stored in the same data field conflict, the value format of the second data source is joined to and extended with the value format of the first data source by concatenation; if the value of the second data source cannot be joined in this way, an alarm is raised.
It will be appreciated that format conflicts take many forms. Some can be handled directly by splicing, while others are complex and unsuitable for direct splicing. For example, if one timestamp format consists of year, month, and day and the other additionally includes the time of day, the timestamp field can be unified simply by padding the time component. A format conflict between range values, by contrast, may not be joinable, in which case an early warning is issued to notify the developer to correct it manually.
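The timestamp case can be sketched as follows. The two format strings, the `MergeAlarm` exception, and the function name `extend_timestamp` are assumptions made for illustration; the embodiment does not fix concrete formats, and this sketch simply normalises both sides to the wider of the two assumed formats.

```python
from datetime import datetime

DATE_FMT = "%Y-%m-%d"                # assumed narrower format
DATETIME_FMT = "%Y-%m-%d %H:%M:%S"   # assumed wider format


class MergeAlarm(Exception):
    """Signals a format conflict that needs manual correction."""


def extend_timestamp(value: str) -> str:
    """Return `value` in the wider format, padding a missing time with zeros."""
    for fmt in (DATETIME_FMT, DATE_FMT):
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next known format
        return parsed.strftime(DATETIME_FMT)
    # Neither known format applies, so the value cannot be joined automatically.
    raise MergeAlarm(f"cannot extend value {value!r}; manual correction required")


padded = extend_timestamp("2023-06-21")
unchanged = extend_timestamp("2023-06-21 08:30:00")
```

A value that fits neither format, such as a range like `"10-20"`, raises `MergeAlarm`, which corresponds to the early-warning path described above.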
If the enumerated items, the code values, and the label values all differ in the same data field, the values in the data field are merged directly.
When the enumerated items, code values, and label values all differ, the data is likely to be new, so the items are merged directly as new values.
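A direct merge of fully disjoint enumerations amounts to a union. In this sketch each enumeration is modelled as a mapping from code value to label value, and `merge_disjoint_enums` is a hypothetical helper; the overlap check is added only to show where this simple rule stops applying.

```python
def merge_disjoint_enums(first: dict, second: dict) -> dict:
    """Union two code -> label mappings that share no code and no label."""
    shared_codes = first.keys() & second.keys()
    shared_labels = set(first.values()) & set(second.values())
    if shared_codes or shared_labels:
        # Overlapping codes or labels need the association-based handling instead.
        raise ValueError("code or label values overlap; direct merge does not apply")
    return {**first, **second}


merged_status = merge_disjoint_enums(
    {"1": "open", "2": "closed"},
    {"3": "archived"},
)
```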
If the enumerated items differ in the same data field and the code values and label values partially overlap, a directed acyclic graph of associations is constructed through association matching and fuzzy matching, and a unique data warehouse code is formed.
When the enumerated items differ but the code values and label values partially overlap, the mapping relations cannot be merged directly. Because the links between code values and label values must not form a closed loop, association matching and fuzzy matching are used to try to build a directed acyclic graph of the associations between them. If the graph is built successfully, the merge succeeds; if it cannot be built, an alarm is raised.
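One possible reading of this step is sketched below; it is not the patented procedure itself. The assumptions are: enumerations are code-to-label mappings, association matching is a case-insensitive exact label match, fuzzy matching uses `difflib.SequenceMatcher` with an arbitrary threshold of 0.85, codes of the first source are reused as warehouse codes, and acyclicity is checked with `graphlib.TopologicalSorter` (Python 3.9 or later). `MergeAlarm`, `labels_match`, and `build_warehouse_codes` are hypothetical names.

```python
from difflib import SequenceMatcher
from graphlib import CycleError, TopologicalSorter


class MergeAlarm(Exception):
    """Raised when the association graph cannot be built automatically."""


def labels_match(a: str, b: str, threshold: float) -> bool:
    # Association matching: exact match after normalisation.
    # Fuzzy matching: similarity ratio at or above the assumed threshold.
    a, b = a.strip().lower(), b.strip().lower()
    return a == b or SequenceMatcher(None, a, b).ratio() >= threshold


def build_warehouse_codes(first: dict, second: dict, threshold: float = 0.85) -> dict:
    """Map each code of the second source to a warehouse code."""
    graph = {}    # warehouse code -> second-source codes that point at it
    mapping = {}
    for code2, label2 in second.items():
        matches = [c1 for c1, l1 in first.items() if labels_match(label2, l1, threshold)]
        if len(matches) > 1:
            raise MergeAlarm(f"label {label2!r} matches several codes: {matches}")
        if matches:
            target = matches[0]
            if target != code2:
                graph.setdefault(target, set()).add(code2)  # edge code2 -> target
        elif code2 in first:
            raise MergeAlarm(f"code {code2!r} is reused with an unrelated label")
        else:
            target = code2  # a genuinely new item keeps its own code
        mapping[code2] = target
    try:
        TopologicalSorter(graph).prepare()  # raises CycleError on a closed loop
    except CycleError as exc:
        raise MergeAlarm("associations form a closed loop") from exc
    return mapping


warehouse = build_warehouse_codes(
    {"M": "Male", "F": "Female"},
    {"1": "male", "2": "Femal", "3": "Unknown"},
)
```

In this example the misspelled label `"Femal"` is associated with `"Female"` by the fuzzy match, while `"Unknown"` has no counterpart and keeps its own code. Two sources that swap codes between the same labels would produce a loop and trigger the alarm.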
It can be understood that in the above merging operations most cases can be merged automatically, completing the unification of the two data dictionaries. Only a few cases result in an alarm, which requests manual intervention by a developer to resolve the problem.
Example 2
As shown in Fig. 5, the present embodiment provides a data dictionary version management apparatus, including:
a first internal processing module 10, configured to obtain a description interface of a software service platform and to extract a first data dictionary of data fields of the software service platform by accessing the description interface;
The functional operation of the first internal processing module 10 is similar to that of step S100 in embodiment 1 and is not described again here.
a second internal processing module 20, configured to obtain recent historical data of the software service platform and to identify the historical data to obtain a second data dictionary of data fields of the software service platform;
The functional operation of the second internal processing module 20 is similar to that of step S200 in embodiment 1 and is not described again here.
a first merging module 30, configured to merge the data of the first data dictionary and the second data dictionary to obtain a first merged data dictionary;
The functional operation of the first merging module 30 is similar to that of step S300 in embodiment 1 and is not described again here.
a second merging module 40, configured to obtain the latest intermediate data dictionary of the software service platform, merge the first merged data dictionary with the latest intermediate data dictionary to obtain a second merged data dictionary, and apply the second merged data dictionary as the latest data dictionary to the software service platform.
The functional operation of the second merging module 40 is similar to that of step S400 in embodiment 1 and is not described again here.
The data dictionary version management device of this embodiment can be applied to at least one software service platform. When used for a single platform, it avoids discrepancies between the internal data description and the actual interaction data caused by errors in the data source or by developers, and it enables self-maintenance and real-time updating of the data dictionary. When used for a plurality of software service platforms, it merges and integrates their data dictionaries and finally generates a complete merged data dictionary applicable to all of the platforms, ensuring that the data dictionaries of the platforms are identical. Enterprises can then freely switch between software service platforms in the same field, or use several of them at the same time, without undue concern about data integration and reporting risks. This ensures the accuracy and stability of data reports and data applications and improves the maintenance efficiency of the data warehouse and reports.
Example 3
The present application also provides a computer device comprising a processor and a memory, the memory storing a computer program which, when run on the processor, performs the data dictionary version management method.
The computer device of this embodiment may be a personal computer, an intelligent terminal, a server, or any other device capable of running the program of the above embodiments. It need not be the carrier of a software service platform; it may be a stand-alone device connected as a third party, such as a stand-alone server or a workstation that accesses the server.
Example 4
The present application also provides a readable storage medium storing a computer program which when run on a processor performs the data dictionary version management method.
The readable storage medium of this embodiment may be a non-volatile storage medium or a volatile storage medium. For example, storage media such as a USB flash drive, a mechanical hard disk, a solid-state drive, a portable hard disk, or a cache may store the computer program that performs the above data dictionary version management method.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may also be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending on the functionality involved. It will also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
In addition, functional modules or units in various embodiments of the invention may be integrated together to form a single part, or the modules may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a smart phone, a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto; any variation or substitution that a person skilled in the art can readily conceive falls within the scope of the present invention.

Claims (7)

1. A method for managing versions of a data dictionary, comprising:
acquiring a description interface of a software service platform, and extracting a first data dictionary of data fields of the software service platform by accessing the description interface;
acquiring recent historical data of the software service platform, and identifying the historical data to obtain a second data dictionary of data fields of the software service platform;
merging the data of the first data dictionary and the second data dictionary to obtain a first merged data dictionary;
acquiring a latest intermediate data dictionary of the software service platform, and merging the first merged data dictionary with the latest intermediate data dictionary to obtain a second merged data dictionary; the latest intermediate data dictionary is the second merged data dictionary obtained by the previous merging;
applying the second merged data dictionary as the latest data dictionary to the software service platform;
if a plurality of software service platforms exist, the obtaining the second merged data dictionary further includes:
acquiring the second merged data dictionaries of the different software service platforms, and merging the second merged data dictionaries according to rule verification to obtain a third merged data dictionary;
merging the latest historical data dictionary with the third merged data dictionary to obtain a merged data dictionary, and applying the merged data dictionary as the latest data dictionary to the software service platform;
the merging process comprises the following steps:
determining the same data field in the first data source and the second data source which participate in merging;
comparing the same data fields in the first data source and the second data source, and determining the difference types among the same data fields;
if the stored values are different in the same data field, merging is performed based on the format of the value of the first data source;
if the formats of the values stored in the same data field conflict, the value format of the second data source is joined to and extended with the value format of the first data source by concatenation;
if the enumerated items, the code values, and the label values all differ in the same data field, directly merging the values in the data field;
if the enumerated items differ in the same data field and the code values and label values partially overlap, a directed acyclic graph of associations is constructed through association matching and fuzzy matching, and a unique data warehouse code is formed.
2. The data dictionary version management method of claim 1, further comprising:
if the software service platform does not have the description interface, acquiring only the second data dictionary, and taking the second data dictionary as the first merged data dictionary;
and merging the latest intermediate data dictionary of the software service platform with the first merged data dictionary to obtain the second merged data dictionary.
3. The data dictionary version management method of claim 1, further comprising:
if a conflict that cannot be merged is found, raising an abnormality alarm, stopping the merging, and taking the latest historical data dictionary as the latest version of the data dictionary;
and if no conflict is found, taking the merged data dictionary as the latest version of the data dictionary.
4. The method for managing version of data dictionary according to claim 1, wherein after extracting the first data dictionary of the data fields of the software service platform by accessing the description interface, further comprising:
and extracting field names and corresponding values in the first data dictionary, and converting the field names and the corresponding values into a file in a JSON format.
5. A data dictionary version management apparatus, comprising:
the first internal processing module is used for acquiring a description interface of the software service platform and extracting a first data dictionary of data fields of the software service platform by accessing the description interface;
the second internal processing module is used for acquiring the latest historical data of the software service platform, and identifying the historical data to obtain a second data dictionary of the data field of the software service platform;
the first merging module is used for merging the data of the first data dictionary and the second data dictionary to obtain a first merged data dictionary;
the second merging module is used for acquiring the latest intermediate data dictionary of the software service platform, merging the first merged data dictionary with the latest intermediate data dictionary to obtain a second merged data dictionary, and applying the second merged data dictionary as the latest data dictionary to the software service platform; the latest intermediate data dictionary is the second merged data dictionary obtained by the previous merging;
if a plurality of software service platforms exist, the obtaining the second merged data dictionary further includes:
acquiring the second merged data dictionaries of the different software service platforms, and merging the second merged data dictionaries according to rule verification to obtain a third merged data dictionary;
merging the latest historical data dictionary with the third merged data dictionary to obtain a merged data dictionary, and applying the merged data dictionary as the latest data dictionary to the software service platform;
the merging process comprises the following steps:
determining the same data field in the first data source and the second data source which participate in merging;
comparing the same data fields in the first data source and the second data source, and determining the difference types among the same data fields;
if the stored values are different in the same data field, merging is performed based on the format of the value of the first data source;
if the formats of the values stored in the same data field conflict, the value format of the second data source is joined to and extended with the value format of the first data source by concatenation;
if the enumerated items, the code values, and the label values all differ in the same data field, directly merging the values in the data field;
if the enumerated items differ in the same data field and the code values and label values partially overlap, a directed acyclic graph of associations is constructed through association matching and fuzzy matching, and a unique data warehouse code is formed.
6. A computer device comprising a processor and a memory, the memory storing a computer program which, when run on the processor, performs the data dictionary version management method of any one of claims 1 to 4.
7. A readable storage medium, characterized in that it stores a computer program which, when run on a processor, performs the data dictionary version management method of any one of claims 1 to 4.
CN202310737245.1A 2023-06-21 2023-06-21 Data dictionary version management method, device, computer equipment and storage medium Active CN116501375B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310737245.1A CN116501375B (en) 2023-06-21 2023-06-21 Data dictionary version management method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310737245.1A CN116501375B (en) 2023-06-21 2023-06-21 Data dictionary version management method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116501375A CN116501375A (en) 2023-07-28
CN116501375B true CN116501375B (en) 2024-02-23

Family

ID=87325024

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310737245.1A Active CN116501375B (en) 2023-06-21 2023-06-21 Data dictionary version management method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116501375B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102591666A (en) * 2012-01-04 2012-07-18 浪潮集团山东通用软件有限公司 Metadata management method for version of hierarchy structure
CN105975258A (en) * 2016-04-27 2016-09-28 中国银行股份有限公司 Data dictionary management method and system
CN109462640A (en) * 2018-10-29 2019-03-12 上海掌门科技有限公司 A kind of metadata synchronization method, data terminal, interactive system and medium
CN109829012A (en) * 2018-12-13 2019-05-31 山东亚华电子股份有限公司 The synchronous method and apparatus of data
CN116257636A (en) * 2023-02-02 2023-06-13 新奥数能科技有限公司 Unified management method and device for enumerated data dictionary, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7236993B2 (en) * 2003-04-16 2007-06-26 Oracle International Corporation On-demand multi-version denormalized data dictionary to support log-based applications
US10572275B2 (en) * 2017-06-15 2020-02-25 Microsoft Technology Licensing, Llc Compatible dictionary layout

Also Published As

Publication number Publication date
CN116501375A (en) 2023-07-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant