CN116932515A - Data management method, device, equipment and medium for realizing data decoupling of production system - Google Patents

Data management method, device, equipment and medium for realizing data decoupling of production system

Info

Publication number
CN116932515A
Authority
CN
China
Prior art keywords
data
dictionary
production system
decoupling
production
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310960161.4A
Other languages
Chinese (zh)
Inventor
张霜洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Health Online Technology Development Co ltd
Original Assignee
Beijing Health Online Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Health Online Technology Development Co ltd
Priority to CN202310960161.4A
Publication of CN116932515A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/211: Schema design and management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2282: Tablespace storage structures; Management thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2455: Query execution
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data management method for realizing data decoupling of a production system. The method comprises: constructing a data tracking server and classifying the production data of the production system through the data tracking server; evaluating the quality of a data dictionary through the granularity and degree of overlap of the dictionary table value field data; performing main index evaluation on the production data, and building a message platform based on the main index evaluation result and the evaluation result of the data dictionary; and, according to the data tracking server's tracking result for the production system, either decommissioning the production system or having the message platform take it over. By building a message platform to decouple the data of new and old systems, the application makes rapid informatization transformation possible while allowing the digitization process to match the business as closely as possible, so that data and business are unified.

Description

Data management method, device, equipment and medium for realizing data decoupling of production system
Technical Field
The present application relates to the field of internet technologies, and in particular, to a data management method, apparatus, device, and storage medium for implementing data decoupling in a production system.
Background
Industrial informatization exhibits a specific "chimney phenomenon": over the history of an enterprise's or organization's informatization, many software systems supporting production accumulate. Some of them are responsible for the same production function but were built in different periods; others were built in similar periods but are responsible for different functions.
The "chimney phenomenon" is a significant obstacle to enterprise management in the digital transformation and comprehensive digitization stages. In the informatization stage, an enterprise or organization moves workflows that were handled offline or by verbal communication into software systems. After the informatization systems have been in place for years, the information deposited by those workflows is stored in them, and the enterprise or organization wants to use this historical work information to reform and improve its existing business processes; at this point it enters the digital transformation stage. Once the enterprise can keep revising its business processes step by step from the accumulated historical information, it wants to accelerate the cycle, monitoring business processes with recent data and improving them in real time; at this point it enters the comprehensive digitization stage. However, enterprises face obstacles in the digitization stages in several respects:
Incomplete use of the information systems means that system data cannot reflect the actual state of the business, so data applications cannot become data-driven. The purpose of informatization is to move offline workflows online, but during construction the work is often strongly shaped by policy, business requirements are fragmented and ill-defined, industrial software demands deep domain knowledge, and the technical capability of the software system's business line or vendor is weak. As a result, software is frequently found to be unusable once it has been fully developed and deployed.
The information systems each operate in isolation and cannot be combined and aggregated along the time dimension. This produces a multi-source heterogeneous state at the business level, so the multi-source heterogeneity at the data level cannot be resolved by technical means alone. Informatization is not carried out as one system per business process; instead, several systems exist for the same process, developed by different vendors at different times. As the business develops, the same business process is likely to change progressively, so a given process, or a node within it, is defined differently in different periods. Determining which items carry the same meaning becomes impossible during digitization, the definition of metadata becomes trapped in the multiple definitions that the same item has across systems, and the work ultimately cannot proceed.
Disclosure of Invention
The main purpose of the application is to provide a data management method, device, equipment and storage medium for realizing data decoupling of a production system, so as to solve the technical problems described above.
In order to achieve the above purpose, the present application provides a data management method for implementing data decoupling of a production system.
The data management method for realizing data decoupling of the production system comprises the following steps:
constructing a data tracking server, and classifying production data of a production system through the data tracking server;
evaluating the quality of a data dictionary through granularity and coincidence degree of dictionary table value field data;
carrying out main index evaluation on the production data, and constructing a message platform based on the main index evaluation result and the evaluation result of the data dictionary;
and, according to the data tracking server's tracking result for the production system, decommissioning the production system or having the message platform take it over.
In an embodiment, the production data includes table structure data, table data amounts, view tracking data, stored procedure data.
In one embodiment, the step of evaluating the quality of the data dictionary by granularity and coincidence of the dictionary table value field data includes:
the data dictionary is evaluated by comparing the value fields in the dictionary table with the value fields of the data dictionary.
In one embodiment, the step of classifying, by the data tracking server, production data of the production system includes:
and classifying the table structure data in the production data into different classification groups, wherein the classification groups comprise a discard table, a dictionary table, a conventional service table, an abnormal service table, a view function entity service table, a statistical table, an intermediate table and a configuration table.
In an embodiment, after the step of evaluating the quality of the data dictionary through granularity and coincidence degree of the dictionary table value field data, the data management method for implementing data decoupling of the production system further includes: and correcting the data dictionary.
In one embodiment, the step of correcting the data dictionary includes: dividing all value domains into levels, and mapping each lowest-level value step by step to its parent level.
In an embodiment, after the step of performing the main index evaluation on the production data and building the message platform based on the main index evaluation result and the evaluation result of the data dictionary, the data management method for implementing data decoupling of the production system further includes:
creating a main index from information in the existing data, merging records that originally refer to the same event through rule-based calculation, marking the same event with a uniform identifier, and using that identifier as a technically mandatory main index for the historical data that predates the creation of the main index.
In addition, in order to achieve the above object, the present application further provides a data management device for implementing data decoupling of a production system, the data management device for implementing data decoupling of a production system includes:
the data tracking module is used for constructing a data tracking server and classifying production data of the production system through the data tracking server;
the first evaluation module is used for evaluating the quality of the data dictionary through granularity and coincidence degree of the dictionary table value range data;
the second evaluation module is used for carrying out main index evaluation on the production data;
the building module is used for building a message platform based on the main index evaluation result and the evaluation result of the data dictionary;
and the processing module is used for decommissioning the production system, or having the message platform take it over, according to the data tracking server's tracking result for the production system.
In addition, to achieve the above object, the present application also provides a data management apparatus for implementing data decoupling of a production system, the data management apparatus for implementing data decoupling of a production system including: the system comprises a memory, a processor and a data governance program stored on the memory and executable on the processor, wherein the data governance program when executed by the processor implements the steps of the data governance method as described above.
In addition, in order to achieve the above object, the present application also provides a computer-readable storage medium having stored thereon a data management program which, when executed by a processor, implements the steps of the data management method as set forth in any one of the above.
The application has the following beneficial effects. In the data management method for realizing data decoupling of a production system, a data tracking server is built and the production data of the production system is classified through it; the quality of a data dictionary is evaluated through the granularity and degree of overlap of the dictionary table value field data; main index evaluation is performed on the production data, and a message platform is built based on the main index evaluation result and the evaluation result of the data dictionary; and, according to the data tracking server's tracking result for the production system, the production system is either decommissioned or taken over by the message platform. By building a message platform to decouple the data of new and old systems, this technical scheme makes rapid informatization transformation possible while allowing the digitization process to match the business as closely as possible, so that data and business are unified.
Drawings
FIG. 1 is a schematic diagram of a device of a hardware operating environment according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of a first embodiment of a data management method for implementing data decoupling of a production system according to the present application;
FIG. 3 is a flow chart of a second embodiment of a data management method for implementing data decoupling in a production system according to the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
FIG. 1 is a schematic diagram of the terminal structure of the hardware operating environment according to an embodiment of the present application.
The terminal of the embodiment of the application may be a PC, or a mobile terminal device with a display function, such as a smartphone, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, or a portable computer.
As shown in fig. 1, the terminal may include: a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connected communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001 described above.
Optionally, the terminal may also include a camera, an RF (Radio Frequency) circuit, sensors, an audio circuit, a Wi-Fi module, and so on. The sensors may include light sensors, motion sensors, and other sensors. Specifically, the light sensors may include an ambient light sensor, which can adjust the brightness of the display screen according to the brightness of the ambient light, and a proximity sensor, which can turn off the display screen and/or the backlight when the mobile terminal is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in all directions (generally three axes) and, when the mobile terminal is stationary, the magnitude and direction of gravity; it can be used for applications that recognize the attitude of the mobile terminal (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and for vibration-recognition functions (such as a pedometer and tap detection). The mobile terminal may of course also be configured with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which are not described here.
It will be appreciated by those skilled in the art that the terminal structure shown in fig. 1 is not limiting of the terminal and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
As shown in fig. 1, an operating system, a network communication module, a user interface module, and a data management program may be included in a memory 1005, which is a type of computer storage medium.
In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting to a background server and performing data communication with the background server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and the processor 1001 may be configured to call a data governance program stored in the memory 1005 and perform the following operations:
constructing a data tracking server, and classifying production data of a production system through the data tracking server;
evaluating the quality of a data dictionary through granularity and coincidence degree of dictionary table value field data;
carrying out main index evaluation on the production data, and constructing a message platform based on the main index evaluation result and the evaluation result of the data dictionary;
and, according to the data tracking server's tracking result for the production system, decommissioning the production system or having the message platform take it over.
Further, the processor 1001 may call a data governance program stored in the memory 1005, and further perform the following operations:
the data dictionary is evaluated by comparing the value fields in the dictionary table with the value fields of the data dictionary.
Further, the processor 1001 may call a data governance program stored in the memory 1005, and further perform the following operations:
and classifying the table structure data in the production data into different classification groups, wherein the classification groups comprise a discard table, a dictionary table, a conventional service table, an abnormal service table, a view function entity service table, a statistical table, an intermediate table and a configuration table.
Further, the processor 1001 may call a data governance program stored in the memory 1005, and further perform the following operations:
and correcting the data dictionary.
Further, the processor 1001 may call a data governance program stored in the memory 1005, and further perform the following operations:
dividing all value domains into levels, and mapping each lowest-level value step by step to its parent level.
Further, the processor 1001 may call a data governance program stored in the memory 1005, and further perform the following operations:
creating a main index from information in the existing data, merging records that originally refer to the same event through rule-based calculation, marking the same event with a uniform identifier, and using that identifier as a technically mandatory main index for the historical data that predates the creation of the main index.
The specific embodiments of the data management device of the present application are substantially the same as the embodiments of the data management method described below, and are not repeated here.
Referring to fig. 2, a first embodiment of the present application provides a data management method for implementing data decoupling of a production system, where the data management method for implementing data decoupling of a production system includes:
Step S10: a data tracking server is built, and the production data of the production system is classified through the data tracking server.
Firstly, a data tracking server is built, and a data tracking database, a data ETL tool, a timing task starting tool and a data tracking result visualization tool are installed.
The production data includes table structure data, table data volume, view tracking data, stored procedure data.
Table structure tracking: a query statement is designed to query the table structures of the entire database. The table structures of production systems A1, A2, A3, ..., An are each stored in the data tracking database through the data ETL tool. Each stored table is named "Table" + the system identifier (e.g., A1) + the date and time of the query (accurate to the second), such as Table_A1_20230612103021.
Table data volume tracking: a query statement is designed to query the row counts of all tables in the entire database. The table row counts of production systems A1, A2, A3, ..., An are each stored in the data tracking database through the data ETL tool. Each stored table is named "TableRow" + the system identifier (e.g., A1) + the date and time of the query (accurate to the second), such as TableRow_A1_20230612103021.
View tracking: a query statement is designed to query the correspondence between views and tables across the entire database. The view-to-table correspondences of production systems A1, A2, A3, ..., An are each stored in the data tracking database through the data ETL tool. Each stored table is named "View" + the system identifier (e.g., A1) + the date and time of the query (accurate to the second), such as View_A1_20230612103021.
Stored procedure tracking: a query statement is designed to query the correspondence between stored procedures and tables across the entire database. The stored-procedure-to-table correspondences of production systems A1, A2, A3, ..., An are each stored in the data tracking database through the data ETL tool. Each stored table is named "Procedure" + the system identifier (e.g., A1) + the date and time of the query (accurate to the second), such as Procedure_A1_20230612103021.
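As an illustration only, the naming convention used in the four tracking items above could be sketched in Python as follows. The function name `snapshot_table_name` and the `SNAPSHOT_PREFIXES` mapping are hypothetical; only the prefixes and the timestamp format are taken from the examples in the text.

```python
from datetime import datetime

# Prefixes taken from the examples in the text; the mapping keys are hypothetical.
SNAPSHOT_PREFIXES = {
    "structure": "Table",
    "rows": "TableRow",
    "view": "View",
    "procedure": "Procedure",
}


def snapshot_table_name(kind: str, system_id: str, queried_at: datetime) -> str:
    """Build a tracking-table name: prefix + system identifier + query time to the second."""
    prefix = SNAPSHOT_PREFIXES[kind]
    return f"{prefix}_{system_id}_{queried_at.strftime('%Y%m%d%H%M%S')}"


name = snapshot_table_name("structure", "A1", datetime(2023, 6, 12, 10, 30, 21))
print(name)  # Table_A1_20230612103021
```

Because the timestamp is accurate to the second, two snapshots of the same system taken by consecutive scheduled runs receive distinct table names.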
Data tracking timing task setting: scheduled tasks are designed according to the business situation, and the production data tracking results (table structure tracking, table data volume tracking, view tracking and stored procedure tracking) are stored in the tracking database at regular intervals.
Data tracking result visualization design: data visualizations are designed from the change information on table structures, table data volumes, views and stored procedures in the tracking database, forming visual charts.
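The change information that feeds such a chart could, for table data volumes, be derived by comparing two snapshots. The following is a minimal sketch under stated assumptions: the function name, the snapshot dictionaries and the table names are all hypothetical, and the text does not prescribe how changes are computed.

```python
def row_count_changes(previous, current):
    """Return {table: delta} between two row-count snapshots.

    A table absent from a snapshot is treated as having 0 rows in it,
    so newly created and dropped tables also show up as changes.
    """
    tables = sorted(set(previous) | set(current))
    return {t: current.get(t, 0) - previous.get(t, 0) for t in tables}


# Hypothetical row-count snapshots of system A1 from two scheduled runs.
snapshot_earlier = {"orders": 1000, "gender_dict": 3, "tmp_import": 50}
snapshot_later = {"orders": 1240, "gender_dict": 3}

changes = row_count_changes(snapshot_earlier, snapshot_later)
print(changes)  # {'gender_dict': 0, 'orders': 240, 'tmp_import': -50}
```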
Step S20: the quality of the data dictionary is evaluated through the granularity and degree of overlap of the dictionary table value domain data.
Quality of the data dictionary: the value domain of each dictionary is checked for whether its entries are mutually independent and completely exhaustive. If they are, the dictionary is a good-quality data dictionary; otherwise it is a poor-quality data dictionary. For example, an age dictionary whose value domain is 0 years old, 1 year old, 2 years old, ... is a good dictionary: a person who is 0 years old cannot also be 1 year old, so the entries are mutually independent, and every possible age is listed, so the value domain is completely exhaustive. An age dictionary consisting of teenagers, young adults, the middle-aged and the elderly is not a good dictionary: its granularity is coarse and its boundaries are unclear (whether ages 12 and 13 count as teenagers or young adults is ambiguous), so under different interpretations the entries can overlap and are not mutually independent.
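For a numeric value domain such as age, the "mutually independent and completely exhaustive" criterion could be checked mechanically. The sketch below is illustrative only: the function name, the representation of entries as inclusive integer intervals, the domain of 0 to 120, and the boundaries assigned to the life-stage dictionary are all assumptions, not part of the application.

```python
def check_ranges(ranges, domain_start, domain_end):
    """Check a dictionary whose entries are (label, low, high) inclusive integer ranges.

    Returns (mutually_independent, completely_exhaustive).
    """
    ordered = sorted(ranges, key=lambda r: r[1])
    # After sorting by lower bound, any overlap shows up between neighbours.
    independent = all(prev[2] < nxt[1] for prev, nxt in zip(ordered, ordered[1:]))
    covered = set()
    for _label, low, high in ordered:
        covered.update(range(low, high + 1))
    exhaustive = covered >= set(range(domain_start, domain_end + 1))
    return independent, exhaustive


# One entry per year of age: fine-grained, no overlap, full coverage.
single_years = [(f"{age} years old", age, age) for age in range(0, 121)]

# Hypothetical boundaries showing how two interpretations can overlap.
life_stages = [
    ("teenager", 10, 19),
    ("young adult", 13, 35),
    ("middle-aged", 36, 59),
    ("elderly", 60, 120),
]

print(check_ranges(single_years, 0, 120))  # (True, True)
print(check_ranges(life_stages, 0, 120))   # (False, False)
```

In this sketch the life-stage dictionary fails both checks: ages 13 to 19 fall under two labels, and ages 0 to 9 fall under none.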
Usage of the data dictionary: the value domains in the data dictionary tables are matched against the conventional service tables, the abnormal service tables and the view function entity service tables, checking whether the dictionary values used in these three kinds of service tables are consistent with the value domain content of the data dictionary. If they are consistent, the data dictionary is judged to be in normal use. If a value that is not in any dictionary table appears among the dictionary values in the three kinds of service tables, the data dictionary is judged to be in abnormal use.
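The usage check described above amounts to a set difference between the values found in the service tables and the values defined in the dictionary. A minimal sketch follows; the function names, the table names and the sample values are hypothetical.

```python
def undefined_values(dictionary_values, business_values):
    """Values used in a service table that the dictionary does not define."""
    return sorted(set(business_values) - set(dictionary_values))


def usage_status(dictionary_values, business_tables):
    """business_tables maps a table name to the values found in its dictionary-coded column.

    Returns ("normal" | "abnormal", {table name: undefined values}).
    """
    stray = {}
    for table_name, values in business_tables.items():
        missing = undefined_values(dictionary_values, values)
        if missing:
            stray[table_name] = missing
    return ("abnormal" if stray else "normal"), stray


# Hypothetical gender dictionary and three service tables that use it.
gender_dictionary = ["M", "F"]
tables = {
    "conventional_service": ["M", "F", "M"],
    "abnormal_service": ["M", "U"],
    "view_function_entity_service": ["F"],
}

print(usage_status(gender_dictionary, tables))
# ('abnormal', {'abnormal_service': ['U']})
```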
Duplication of data dictionaries: the value domains of the dictionaries are compared for large-scale repetition. If the content of dictionaries is largely repeated and the dictionaries are used by different conventional service tables, abnormal service tables and view function entity service tables, they are duplicate data dictionaries.
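One way to quantify "largely repeated" is the ratio of shared values to all values across two dictionaries (the Jaccard index). This is a swapped-in measure for illustration: the application does not name a metric, and the 0.8 threshold, function names and sample dictionaries below are assumptions. The sketch covers only the overlap comparison, not the check that the dictionaries are used by different service tables.

```python
def overlap_ratio(values_a, values_b):
    """Jaccard index: shared values divided by all distinct values."""
    a, b = set(values_a), set(values_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def likely_duplicates(dictionaries, threshold=0.8):
    """Return (name_a, name_b, ratio) for dictionary pairs at or above the threshold."""
    names = sorted(dictionaries)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            ratio = overlap_ratio(dictionaries[first], dictionaries[second])
            if ratio >= threshold:
                pairs.append((first, second, ratio))
    return pairs


# Hypothetical dictionaries: two department code lists and one gender list.
dictionaries = {
    "dept_v1": ["01", "02", "03", "04", "05"],
    "dept_v2": ["01", "02", "03", "04", "05", "06"],
    "gender": ["M", "F"],
}

for first, second, ratio in likely_duplicates(dictionaries):
    print(first, second, round(ratio, 2))  # dept_v1 dept_v2 0.83
```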
Step S30: main index evaluation is performed on the production data, and a message platform is built based on the main index evaluation result and the evaluation result of the data dictionary.
Technical evaluation of the main index: by examining how the main index field is designed in the table structure, it can be determined whether the main index is technically mandatory or technically non-mandatory.
Business evaluation of the main index: by evaluating the actual business, it can be determined whether the main index is mandatory at the business level (for example, an identification number is a typical main index that is mandatory at the business level).
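The technical side of this evaluation could be read from the table structure itself. The sketch below uses SQLite as a stand-in database and treats a column as technically mandatory when it is declared NOT NULL or is part of the primary key; that criterion, the function name and the `patient` table are assumptions for illustration, since the application does not state how "technically mandatory" is decided.

```python
import sqlite3


def is_technically_mandatory(conn, table, column):
    """Assumed criterion: the column is declared NOT NULL or belongs to the primary key."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    # The table name is interpolated directly, so pass trusted names only.
    for _cid, name, _type, notnull, _default, pk in conn.execute(
        f"PRAGMA table_info({table})"
    ):
        if name == column:
            return bool(notnull) or bool(pk)
    raise KeyError(f"{table}.{column} not found")


# Hypothetical table: the identification number is required, the phone is not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id_number TEXT NOT NULL, phone TEXT)")

print(is_technically_mandatory(conn, "patient", "id_number"))  # True
print(is_technically_mandatory(conn, "patient", "phone"))      # False
```

Whether a field is mandatory at the business level cannot be read from the schema and still requires the business evaluation described above.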
In addition, the application can evaluate metadata definition and data chain breakage.
Data-level definition evaluation: the data in the business tables is inspected to see whether each data item carries a consistent meaning. If data is found that does not match its field (for example, a mobile phone number stored in an address field), the data level is considered to be unclearly defined.
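The mobile-phone-number example can be turned into a simple screening rule. The following sketch assumes that a value consisting of exactly 11 digits starting with 1 is a mobile phone number; that pattern, the function name and the sample values are illustrative assumptions, not part of the application.

```python
import re

# Assumed pattern for a mobile phone number: 11 digits starting with 1.
PHONE_PATTERN = re.compile(r"1\d{10}")


def misplaced_phone_numbers(address_values):
    """Return address-field values that look like a phone number rather than an address."""
    return [
        value
        for value in address_values
        if PHONE_PATTERN.fullmatch(value.strip())
    ]


# Hypothetical contents of an address column.
addresses = ["12 Example Road", "13800000000", " 13900000000 ", "Building 3, Unit 2"]

print(misplaced_phone_numbers(addresses))  # ['13800000000', ' 13900000000 ']
```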
Business-level definition evaluation: the data in the business tables, the system's front-end interface and the actual business are checked to see whether the three are consistent. If they are not (for example, the address is clearly specified as five levels: province, city, county, street and specific address, yet the specific address of some records includes the province, city, county and street information while that of others does not), the business level is considered to be unclearly defined.
Data chain break evaluation: the incremental data of the same business in the business tables is compared with the business steps. If some business steps are found not to exist in any system, for example where the hand-off between two systems is carried out manually offline (such as downloading a result from one system and filling it in or importing it into the other), a data chain break can be determined.
Step S40: according to the data tracking server's tracking result for the production system, the production system is either decommissioned or taken over by the message platform.
In this embodiment, building a message platform to decouple the data of new and old systems makes rapid informatization transformation possible, while allowing the digitization process to match the business as closely as possible so that data and business are unified.
The feasibility of data governance is also enhanced. Because the message platform is used to reuse the old business, staff on the old business line keep using their own business data until the old business is finally closed, which fundamentally avoids the governance dilemma caused by a hard landing of the old business. In addition, analyzing business patterns from objective data messages avoids the communication problems that arise when business and technical personnel interpret the business differently.
The problems at the data layer are exposed to the greatest extent, and technical work, data work and business work are separated. Technical personnel focus on business requirements to develop new system functions, are freed from the historical data problems left by old functions, and can design new architectures and table structures more flexibly. Data personnel take on the conversion between new and old data and the difficulty of fitting the data line to the business line. Business personnel can concentrate on the functions the business itself needs, without having to consider problems caused by earlier implementations of business functions.
In terms of improving data availability, when data has to be presented in multiple dimensions (the data structure conflicts with the business, and the storage form must be converted before use), the message platform can present it in multiple dimensions without requiring every dimension to be correct, leaving the choice of data architecture open for future data applications. Once the data application is stable and the data architecture is confirmed to be correct, data requirements, including data decoupling and service decoupling, are raised against the production system, so that data iteration stays within a usable range.
By constructing a method for decoupling new and old system data by a message platform, rapid informatization transformation becomes possible, and meanwhile, the digitalized process can be matched with the service to the greatest extent so as to achieve the unification of the data and the service.
In the above embodiment, after the message platform is built, the operation of the message platform in parallel and in combination with the production system can be evaluated.
For example:
Supplementary evaluation: it is checked whether the operational data in the data tracking database can supplement the data of the production system.
Correlation evaluation: it is checked whether the operational data in the data tracking database can be used with the production system data.
Data volume evaluation: the incremental status of the operational data in the data tracking database is reviewed.
Program modification workload assessment: and evaluating the program reconstruction workload according to the supplement, the relevance and the data volume evaluation.
Further, the step of classifying, by the data tracking server, the production data of the production system includes:
classifying the table structure data in the production data into different classification groups, where the classification groups include a discard table, a dictionary table, a conventional service table, an abnormal service table, a view function entity service table, a statistical table, an intermediate table, and a configuration table.
Discard table: a table whose row count does not change throughout the year and which does not appear in table structure references.
Dictionary table: a table whose row count varies throughout the year and which appears in table structure references.
Conventional service table: a table whose row count grows incrementally throughout the year, which does not appear in views or stored procedures, and whose table structure is relatively stable.
Abnormal service table: a table whose row count both increases and decreases, which does not appear in views or stored procedures, and whose table structure is relatively stable.
View function entity service table: a table whose row count grows incrementally throughout the year, which has a very large number of fields, which appears in stored procedures, and whose row count is found to grow in large batches at the execution times of those stored procedures.
Statistical table: a table whose row count grows incrementally throughout the year, which does not have a large number of fields, which appears in stored procedures, and whose row count is found to grow in large batches at the execution times of those stored procedures.
Intermediate table and configuration table: a table whose row count shows no obvious change throughout the year, which has few fields, and which manual investigation shows to have little business significance.
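The classification rules above can be sketched as a small rule-based function. This is only an illustration under stated assumptions: the `TableProfile` fields, the two column-count thresholds, and the order in which the rules are checked are hypothetical and are not specified in the description.

```python
from dataclasses import dataclass

# Assumed thresholds; the description only says "very large" and "small".
WIDE_TABLE_COLUMNS = 50
NARROW_TABLE_COLUMNS = 10


@dataclass
class TableProfile:
    """Tracking statistics for one production table (hypothetical shape)."""
    name: str
    growth: str                 # "none", "increase", or "mixed" over the year
    column_count: int
    referenced: bool            # appears in table structure references
    in_views: bool
    in_procedures: bool
    batch_growth: bool = False  # rows jump at stored-procedure execution times


def classify(t: TableProfile) -> str:
    """Return the classification group for a table, or "unclassified"."""
    used_by_code = t.in_views or t.in_procedures
    if t.referenced:
        return "dictionary table"
    if t.growth == "none":
        if not used_by_code:
            return "discard table"
        if t.column_count <= NARROW_TABLE_COLUMNS:
            # Still needs the manual investigation mentioned in the text.
            return "intermediate or configuration table"
        return "unclassified"
    if t.in_procedures and t.growth == "increase" and t.batch_growth:
        if t.column_count >= WIDE_TABLE_COLUMNS:
            return "view function entity service table"
        return "statistical table"
    if not used_by_code and t.growth == "increase":
        return "conventional service table"
    if not used_by_code and t.growth == "mixed":
        return "abnormal service table"
    return "unclassified"
```

Tables that fall through every rule are returned as "unclassified" so that they can be reviewed by hand rather than forced into a group.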
Further, after the step of evaluating the quality of the data dictionary through the granularity and degree of overlap of the dictionary table value domain data, the data management method for implementing data decoupling of the production system further includes: correcting the data dictionary.
Specifically, when the values in a data dictionary are not clearly defined, the state of a business event may match more than one value, while only one option can actually be selected. The system operator can then only pick one arbitrarily, so the statistics are incomplete. For example, a color dictionary has the two values "red" and "dark red", but "dark red" is itself a kind of "red", so counting only the value "red" undercounts red. Two approaches can be taken to correct this:
Changing the single-level dictionary table into a multi-level dictionary table: all values are divided by level, and each lower-level value is mapped upward, level by level, to its parent. In the example, "red" is a first-level value and "dark red" is a second-level value mapped to "red", so data for "dark red" is automatically included when "red" is counted. This approach suits cases where the business and the dictionary are relatively clear, the value domain can be enumerated exhaustively, and values added in the future will not conflict with existing ones.
Changing the dictionary table into a tag table: the values of all dictionary tables are turned into tags, and the service layer changes from single selection to multiple selection. In the example, "red" and "dark red" are both tags. When a business event occurs, both tags can be attached to the item, expressing that it is "dark red" and therefore also "red", so future statistics for "red" also include "dark red". This approach suits cases where the business and the dictionary are relatively fuzzy, the value domain cannot yet be enumerated exhaustively, values need to be added at any time, and data must still be recorded easily even when an added value overlaps with existing ones.
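The two correction approaches can be compared with a minimal sketch that uses the "red" / "dark red" example. The `PARENT` mapping, the function names, and the extra value "blue" are hypothetical and serve only as illustration.

```python
from collections import Counter

# Approach 1: multi-level dictionary. Each value points to its parent value;
# None marks a first-level value.
PARENT = {"red": None, "dark red": "red", "blue": None}


def rollup_counts(values, parent=PARENT):
    """Count each record under its own value and under every ancestor value."""
    counts = Counter()
    for value in values:
        node = value
        while node is not None:
            counts[node] += 1
            node = parent[node]  # raises KeyError for a value not in the dictionary
    return counts


# Approach 2: tag table. Each record carries a set of tags (multiple selection).
def tag_counts(tagged_records):
    """Count how many records carry each tag."""
    counts = Counter()
    for tags in tagged_records:
        counts.update(set(tags))
    return counts
```

With the multi-level dictionary the roll-up is derived from the hierarchy, so operators still select a single value. With the tag table the roll-up depends on operators attaching both tags, which is more flexible but relies on consistent tagging.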
Correction of data dictionary usage problems: when the value domain of the data dictionary cannot cover the values found in the service table, there are two possible causes. One is that, historically, the field was filled in freely rather than selected from the dictionary's values. The other is that a dictionary did exist, but it was revised during development without keeping the original, that is, the new dictionary table was produced by modifying the original dictionary table and the data of the original was not retained. Two approaches can be applied in sequence to correct this:
Integrating the historical data and mapping it to the value domain of the current dictionary: after all values occurring in the service table have been enumerated, they are mapped to the existing dictionary, so that statistics under the current caliber can be produced.
Introducing version management for the dictionary from now on: whenever the dictionary is revised in the future, the existing version is retained, and the values of historical versions are mapped to those of the current version, so that statistics under every dictionary caliber can be produced.
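A minimal sketch of the two steps, assuming the dictionary versions and the value mapping are kept as plain lookup tables. The version numbers, the example values, and the convention of registering free-text values under version 0 are all hypothetical.

```python
from collections import Counter

# Hypothetical dictionary versions: version number -> allowed values.
DICTIONARY_VERSIONS = {
    1: {"outpatient", "inpatient dept"},
    2: {"outpatient", "inpatient"},
}
CURRENT_VERSION = 2

# Hypothetical mapping from (version, historical value) to a current value.
# Free-text values found in the service table are registered under version 0.
VALUE_MAP = {
    (1, "inpatient dept"): "inpatient",
    (0, "in-patient"): "inpatient",
    (0, "OPD"): "outpatient",
}


def to_current(version, value):
    """Translate a stored value to the current dictionary, or None if unmapped."""
    if value in DICTIONARY_VERSIONS[CURRENT_VERSION]:
        return value
    return VALUE_MAP.get((version, value))


def count_under_current(rows):
    """Count (version, value) rows under the current statistical caliber."""
    counts = Counter()
    unmapped = []
    for version, value in rows:
        current = to_current(version, value)
        if current is None:
            unmapped.append((version, value))
        else:
            counts[current] += 1
    return counts, unmapped
```

Values that cannot be mapped are returned separately instead of being dropped, so they can be added to the mapping after review.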
Correction of data dictionary repetition problems: when several existing dictionaries are used by several service tables yet represent the same statistical caliber, the statistics often become confused. This is the phenomenon of inconsistent statistical caliber. There are two possible causes. One is that the statistics are inherently multidimensional, but the dimensions have similar meanings and are not clearly defined, for example a "department dictionary", a "subject dictionary", and a "professional dictionary" are all defined as a "professional dictionary". The other is that certain matters have, historically or to this day, been deliberately obscured so that they cannot be counted, for example where the state explicitly requires that hospitals not compile such statistics. These two cases are handled as follows:
For data that can be counted, the dictionary tables with similar meanings are split apart and clearly defined in the service table fields. For data that definitely cannot be counted, no processing is performed.
Referring to Fig. 3, a second embodiment of the present application provides a data management method for implementing data decoupling of a production system. After the step of performing main index evaluation on the production data and building a message platform based on the main index evaluation result and the evaluation result of the data dictionary, the method further includes:
Step S50: creating a main index from information in the existing data, merging events that are in fact the same through rule calculation, marking identical events with a uniform identifier, and using the historical data from before the main index was created as a forced technical main index.
When a technical non-forced main index is encountered, step S50 applies: the main index is created from information in the existing data, events that are in fact the same are merged through rule calculation and given a uniform identifier, and the historical data from before the main index was created is used as the forced technical main index.
Correction of the forced business main index: after the forced technical main index is generated, conflicting values are found in the historical data. To avoid such conflicts in the future, a forced main index must be set at the business level. For example, in the historical data, the rule "same mobile phone number and same name" shows that two user records are actually one person, yet the two records carry two different addresses. The business therefore needs to be designed so that the mobile phone number serves as the main index, which prevents two users from being registered with the same mobile phone number. A stricter identifier, the identity card number, can also be introduced as the main index for persons.
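The rule-based merge and the conflict check described above can be sketched as follows. The record fields (`phone`, `name`, `address`), the identifier format, and the function name are assumptions made for illustration.

```python
def build_main_index(records):
    """Assign one main-index ID to records that share (phone, name).

    Returns the records with a "main_index" field added, plus a dict of
    main-index IDs whose merged records disagree on the address.
    """
    index_by_key = {}
    addresses = {}
    assigned = []
    for rec in records:
        key = (rec["phone"], rec["name"])
        # Reuse the ID of an earlier record with the same key, else mint one.
        main_id = index_by_key.setdefault(key, f"P{len(index_by_key) + 1:04d}")
        assigned.append({**rec, "main_index": main_id})
        addresses.setdefault(main_id, set()).add(rec["address"])
    conflicts = {
        main_id: sorted(values)
        for main_id, values in addresses.items()
        if len(values) > 1
    }
    return assigned, conflicts
```

The conflicts returned here correspond to the conflicting historical values mentioned in the text. Preventing them going forward is a business-level constraint, for example a uniqueness rule on the mobile phone number at registration, which this sketch does not enforce.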
The above embodiments use the Nginx-Kafka-ClickHouse technology stack, which has the following characteristics: compared with the traditional Apache open-source technology, building the message platform on Kafka-ClickHouse is simple to develop and makes the data more usable. Event-tracking (buried-point) data can be written directly into the ClickHouse OLAP database through ClickHouse's Kafka engine, and the stored data can be used directly together with production data. This removes the intermediate steps introduced by file storage and unifies offline data analysis, real-time data analysis, and production data analysis within the message platform.
Example Kafka engine settings (messages use the JSONEachRow format):
ENGINE = Kafka SETTINGS
    kafka_broker_list = 'localhost:9092',
    kafka_topic_list = 'topic1,topic2',
    kafka_group_name = 'group1',
    kafka_format = 'JSONEachRow',
    kafka_row_delimiter = '\n',
    kafka_schema = '',
    kafka_num_consumers = 2
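With the JSONEachRow format, each Kafka message body is a JSON object on its own line. A minimal sketch of how a producer might serialize tracking events in that shape is shown below. The event field names are hypothetical, and sending the payload to Kafka is left out.

```python
import json


def to_json_each_row(events):
    """Serialize events as JSONEachRow: one JSON object per line."""
    return "".join(
        json.dumps(event, ensure_ascii=False, separators=(",", ":")) + "\n"
        for event in events
    )


# Hypothetical tracking events.
payload = to_json_each_row([
    {"event_time": "2023-08-01 10:00:00", "user_id": "u1", "event": "click"},
    {"event_time": "2023-08-01 10:00:05", "user_id": "u2", "event": "view"},
])
```

The keys of each object are expected to match the column names of the table that consumes the topic.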
ClickHouse landing table example:
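As an illustration only, and not the example from the application itself, one common way to land Kafka messages in ClickHouse uses three objects: a Kafka engine table that consumes the topic, a MergeTree table that stores the rows, and a materialized view that copies rows from the first to the second. The sketch below renders that DDL as strings. The table names, the columns, and the `landing_ddl` helper are hypothetical.

```python
def landing_ddl(broker, topic, group, consumers=2):
    """Return three ClickHouse DDL statements for a Kafka-to-MergeTree pipeline.

    Values are interpolated directly, so this is only suitable for trusted
    configuration values in a sketch like this one.
    """
    columns = "event_time DateTime, user_id String, event String"
    queue = (
        f"CREATE TABLE tracking_queue ({columns}) ENGINE = Kafka SETTINGS "
        f"kafka_broker_list = '{broker}', kafka_topic_list = '{topic}', "
        f"kafka_group_name = '{group}', kafka_format = 'JSONEachRow', "
        f"kafka_num_consumers = {consumers}"
    )
    storage = (
        f"CREATE TABLE tracking_events ({columns}) "
        "ENGINE = MergeTree ORDER BY (event_time, user_id)"
    )
    view = (
        "CREATE MATERIALIZED VIEW tracking_mv TO tracking_events AS "
        "SELECT event_time, user_id, event FROM tracking_queue"
    )
    return [queue, storage, view]


for statement in landing_ddl("localhost:9092", "topic1", "group1"):
    print(statement + ";")
```

Once the rows are in the MergeTree table they can be queried alongside production data, which is the direct use of landed data that the description refers to.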
In addition, an embodiment of the present application also provides a data management device for implementing data decoupling of a production system, which comprises:
a data tracking module, used for constructing a data tracking server and classifying production data of the production system through the data tracking server;
a first evaluation module, used for evaluating the quality of the data dictionary through the granularity and degree of overlap of the dictionary table value domain data;
a second evaluation module, used for performing main index evaluation on the production data;
a building module, used for building a message platform based on the main index evaluation result and the evaluation result of the data dictionary;
and a processing module, used for retiring the production system, or having the message platform take over the production system, according to the data tracking result of the data tracking server on the production system.
It should be noted that, for convenience and brevity of description, the specific working process of the apparatus and each module described above may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In addition, an embodiment of the present application also provides a computer-readable storage medium on which a data management program is stored, the data management program implementing the following operations when executed by a processor:
constructing a data tracking server, and classifying production data of a production system through the data tracking server;
evaluating the quality of a data dictionary through the granularity and degree of overlap of the dictionary table value domain data;
performing main index evaluation on the production data, and constructing a message platform based on the main index evaluation result and the evaluation result of the data dictionary;
and, according to the data tracking result of the data tracking server on the production system, retiring the production system or having the message platform take it over.
Further, the data management program, when executed by the processor, also performs the following operation: evaluating the data dictionary by comparing the value domain in the dictionary table with the value domain of the data dictionary.
Further, the data management program, when executed by the processor, also performs the following operation:
classifying the table structure data in the production data into different classification groups, where the classification groups include a discard table, a dictionary table, a conventional service table, an abnormal service table, a view function entity service table, a statistical table, an intermediate table, and a configuration table.
Further, the data management program, when executed by the processor, also performs the following operation:
correcting the data dictionary.
Further, the data management program, when executed by the processor, also performs the following operation:
dividing all values by level, and mapping each lower-level value upward, level by level, to its parent.
Further, the data management program, when executed by the processor, also performs the following operation:
creating a main index from information in the existing data, merging events that are in fact the same through rule calculation, marking identical events with a uniform identifier, and using the historical data from before the main index was created as a forced technical main index.
The specific embodiments of the computer-readable storage medium of the present application are substantially the same as the embodiments of the above-mentioned data management method for implementing data decoupling of a production system, and are not described again here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present application.
The foregoing describes only preferred embodiments of the present application and does not limit its scope. Any equivalent structure or equivalent process derived from the content of this description and the drawings, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the present application.

Claims (10)

1. A data governance method for implementing data decoupling of a production system, characterized by comprising the following steps:
constructing a data tracking server, and classifying production data of the production system through the data tracking server;
evaluating the quality of a data dictionary through the granularity and degree of overlap of dictionary table value domain data;
performing main index evaluation on the production data, and constructing a message platform based on the main index evaluation result and the evaluation result of the data dictionary;
and, according to the data tracking result of the data tracking server on the production system, retiring the production system or having the message platform take it over.
2. The data governance method for implementing data decoupling of a production system of claim 1, wherein said production data comprises table structure data, table data volume, view trace data, and stored procedure data.
3. The data governance method for implementing data decoupling of a production system of claim 1, wherein said step of evaluating the quality of a data dictionary through the granularity and degree of overlap of dictionary table value domain data comprises:
evaluating the data dictionary by comparing the value domain in the dictionary table with the value domain of the data dictionary.
4. The data governance method for implementing data decoupling of a production system of claim 3, wherein said step of classifying production data of the production system through the data tracking server comprises:
classifying the table structure data in the production data into different classification groups, wherein the classification groups comprise a discard table, a dictionary table, a conventional service table, an abnormal service table, a view function entity service table, a statistical table, an intermediate table, and a configuration table.
5. The data governance method for implementing data decoupling of a production system of claim 3, wherein after said step of evaluating the quality of a data dictionary through the granularity and degree of overlap of dictionary table value domain data, the method further comprises: correcting the data dictionary.
6. The data governance method for implementing data decoupling of a production system of claim 5, wherein said step of correcting the data dictionary comprises: dividing all values by level, and mapping each lower-level value upward, level by level, to its parent.
7. The data governance method for implementing data decoupling of a production system of claim 1, wherein after the step of performing main index evaluation on the production data and constructing a message platform based on the main index evaluation result and the evaluation result of the data dictionary, the method further comprises:
creating a main index from information in the existing data, merging events that are in fact the same through rule calculation, marking identical events with a uniform identifier, and using the historical data from before the main index was created as a forced technical main index.
8. A data governance device for implementing data decoupling of a production system, characterized in that the data governance device comprises:
a data tracking module, used for constructing a data tracking server and classifying production data of the production system through the data tracking server;
a first evaluation module, used for evaluating the quality of the data dictionary through the granularity and degree of overlap of the dictionary table value domain data;
a second evaluation module, used for performing main index evaluation on the production data;
a building module, used for building a message platform based on the main index evaluation result and the evaluation result of the data dictionary;
and a processing module, used for retiring the production system, or having the message platform take over the production system, according to the data tracking result of the data tracking server on the production system.
9. A data governance device for implementing data decoupling of a production system, the data governance device for implementing data decoupling of a production system comprising: a memory, a processor and a data governance program stored on the memory and executable on the processor, the data governance program when executed by the processor implementing the steps of the data governance method of any of claims 1 to 7.
10. A computer readable storage medium, wherein a data governance program is stored on the computer readable storage medium, which when executed by a processor, implements the steps of the data governance method of any of claims 1 to 7.

Priority Applications (1)

Application Number: CN202310960161.4A
Priority Date: 2023-08-01
Filing Date: 2023-08-01
Title: Data management method, device, equipment and medium for realizing data decoupling of production system
Status: Pending

Publications (1)

Publication Number: CN116932515A (en)
Publication Date: 2023-10-24

Family

ID: 88394093

Country Status (1)

Country: CN (1)
Link: CN116932515A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination