CN116910023A - Data management system - Google Patents
- Publication number
- CN116910023A (CN202310852654.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- data processing
- module
- service
- functions
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G06F16/21: Design, administration or maintenance of databases
- G06F16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
- G06F16/252: Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
- G06F16/258: Data format conversion from or to a database
- G06F16/26: Visual data mining; Browsing structured data
- G06F21/6227: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
- G06F21/64: Protecting data integrity, e.g. using checksums, certificates or signatures
Abstract
The invention discloses a data management system comprising: a data processing definition module; a data processing design module; a data processing implementation module; a data processing integration module; a data service interface module; a data quality monitoring module; a data security and privacy protection module; a data visualization module; and a data storage and retrieval module. Working together, these components define, design, implement, and integrate data processing, provide data services, monitor data quality, protect data security and privacy, visualize data, and provide data storage and retrieval, thereby improving the efficiency, consistency, and reliability of data processing.
Description
Technical Field
The invention relates to a data management system.
Background
A middle-platform data processing system is a unified data processing platform built within an enterprise. It integrates the data processing functions of the individual business systems and provides centralized management, sharing, and reuse of data.
Without a middle-platform data processing system, an enterprise may face duplicated development and maintenance work, data silos and information isolation, inconsistent data, inefficiency and wasted resources, difficulty in assuring data quality, and a lack of unified data service interfaces. A middle-platform data processing system addresses these problems by providing a unified data processing platform with centralized management, sharing, and reuse of data, which improves the efficiency, consistency, and reliability of data processing.
Disclosure of Invention
The present invention aims to provide a data management system that solves the problems described in the Background above.
To achieve this purpose, the invention provides the following technical solution:
A data management system, comprising:
a data processing definition module: defining the data processing tasks to be performed according to business requirements;
a data processing design module: designing a data processing flow according to the defined data processing tasks;
a data processing implementation module: implementing specific data processing functions according to the requirements of the data processing tasks and the data processing flow;
a data processing integration module: integrating the implemented data processing function modules into a business middle platform so that the data processing functions can be operated and managed independently;
a data service interface module: providing data service interfaces to business systems and other application systems according to business requirements, so that data can be shared and applied;
a data quality monitoring module: monitoring accuracy, integrity, and consistency indicators of the data, and detecting and handling data quality problems;
a data security and privacy protection module: integrated with the data processing flow to ensure the security and privacy of data during processing;
a data visualization module: presenting data processing results as charts, reports, and dashboards, with rich visualization functions and interaction modes;
a data storage and retrieval module: providing efficient data storage and retrieval functions, tightly integrated with the data processing flow, for fast and reliable data access.
As a further aspect of the invention, the definition method of the data processing definition module comprises:
defining data processing tasks according to business requirements, the tasks comprising data acquisition, data cleaning, data conversion, data analysis, and data application;
data acquisition: obtaining raw data from different data sources;
data cleaning: preprocessing the data by filtering, de-duplicating, and filling missing values;
data conversion: converting the data format into the required form;
data analysis: performing statistics, mining, and modeling on the data;
data application: applying the analysis results to business scenarios.
As a further aspect of the invention, the design method of the data processing design module comprises:
designing a data processing flow according to the defined data processing tasks;
data flow scheduling: determining the execution order and dependencies of the data processing tasks so that data moves through the flow correctly;
data service invocation: invoking the corresponding data services, such as a data storage service and a data computing service, as required.
As a further aspect of the invention, the method of the data processing integration module comprises:
integrating the implemented data processing function modules into the business middle platform;
enabling independent operation and management of the data processing functions;
wherein the business middle platform is a unified platform that integrates the functions and services of the business systems, including the data processing functions.
As a further aspect of the invention, the method of the data service interface module comprises:
providing data service interfaces to business systems and other application systems according to business requirements;
enabling data to be shared and applied;
so that other systems can conveniently invoke the data processing functions, obtain the processed results, and apply them to their own business scenarios.
As a further aspect of the invention, the data quality monitoring module provides the following functions:
monitoring accuracy, integrity, and consistency indicators of the data;
detecting and handling data quality problems in a timely manner;
including data anomaly detection, data consistency verification, and data integrity verification.
As a further aspect of the invention, the method of the data security and privacy protection module comprises:
applying data encryption, access control, and desensitization measures;
protecting the security and privacy of the data;
integrating with the data processing flow so that the security and privacy of data are ensured during processing.
As a further aspect of the invention, the data visualization module provides the following functions:
presenting data processing results as charts, reports, and dashboards;
providing rich visualization functions and interaction modes;
meeting the needs of different users.
As a further aspect of the invention, the data storage and retrieval module provides the following functions:
providing efficient data storage and retrieval;
including persistent data storage, index creation, and query optimization;
integrating tightly with the data processing flow to provide fast and reliable data access.
Compared with the prior art, the invention has the beneficial effects that:
By adopting the business middle platform design approach, the data processing functions are decoupled from the business systems to form an independent business middle platform. The data processing functions can then be operated and managed independently, which improves the efficiency and flexibility of data processing while reducing its cost and risk. In addition, data flow scheduling and data service invocation are used to carry out data acquisition, processing, analysis, and application, giving the enterprise comprehensive data support. The system is suitable for many kinds of enterprises, including e-commerce, financial, medical, and educational enterprises, and can be customized to the characteristics and requirements of each enterprise to meet its specific business needs. The system can also monitor and manage data processing tasks and data service interfaces in real time, which further improves the reliability and stability of data processing.
Drawings
FIG. 1 is a block diagram of a data management system.
Description of the embodiments
The technical solution of this patent is described in further detail below with reference to specific embodiments.
Referring to FIG. 1, a data management system includes:
a data processing definition module: defining the data processing tasks to be performed according to business requirements;
a data processing design module: designing a data processing flow according to the defined data processing tasks;
a data processing implementation module: implementing specific data processing functions according to the requirements of the data processing tasks and the data processing flow;
a data processing integration module: integrating the implemented data processing function modules into a business middle platform so that the data processing functions can be operated and managed independently;
a data service interface module: providing data service interfaces to business systems and other application systems according to business requirements, so that data can be shared and applied;
a data quality monitoring module: monitoring accuracy, integrity, and consistency indicators of the data, and detecting and handling data quality problems;
a data security and privacy protection module: integrated with the data processing flow to ensure the security and privacy of data during processing;
a data visualization module: presenting data processing results as charts, reports, and dashboards, with rich visualization functions and interaction modes;
a data storage and retrieval module: providing efficient data storage and retrieval functions, tightly integrated with the data processing flow, for fast and reliable data access.
Data processing definition module:
In this step, the data processing tasks to be performed are defined according to business requirements. These tasks may include data acquisition, data cleaning, data conversion, data analysis, and data application, each described below.
Data acquisition: raw data is obtained from different data sources, which may include databases, log files, sensors, and APIs. The purpose of data acquisition is to collect the required data in a centralized location for subsequent processing and analysis.
Data cleaning: the raw data is preprocessed to ensure its quality and accuracy. Cleaning operations include filtering the data, removing invalid or erroneous records, de-duplicating, and filling in missing values. Data cleaning improves the reliability and consistency of the data.
Data conversion: data is converted from one format or form to another to meet specific requirements, for example from structured to unstructured data or vice versa. Data conversion may also include normalization and encoding to prepare the data for subsequent analysis and application.
Data analysis: statistics, mining, and modeling are applied to the data to extract valuable information and insight. Data analysis may include descriptive statistical analysis, the application of data mining algorithms, and the training of machine learning models. It can reveal correlations, trends, and anomalies in the data and so support business decisions.
Data application: the results of data analysis are applied to a specific business scenario to achieve a business objective. Data applications may include generating reports, formulating strategies, optimizing processes, and making predictions and recommendations. Through data application, analysis results are translated into concrete actions and value.
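The five task types described above can be illustrated with a minimal sketch. It assumes a small in-memory list of order records standing in for a real data source; the function names, field names, and default fill value are hypothetical and are not specified by the patent.

```python
from statistics import mean


def acquire():
    # Hypothetical in-memory source standing in for a database, log file, sensor, or API.
    return [
        {"order_id": "A1", "amount": "10.5"},
        {"order_id": "A1", "amount": "10.5"},  # duplicate record
        {"order_id": "A2", "amount": None},    # missing value
        {"order_id": "A3", "amount": "20.0"},
        {"order_id": "", "amount": "5.0"},     # invalid record: no identifier
    ]


def clean(records, default_amount="0.0"):
    # Filter invalid records, de-duplicate by order_id, and fill missing amounts.
    seen, cleaned = set(), []
    for rec in records:
        if not rec["order_id"] or rec["order_id"] in seen:
            continue
        seen.add(rec["order_id"])
        amount = rec["amount"] if rec["amount"] is not None else default_amount
        cleaned.append({"order_id": rec["order_id"], "amount": amount})
    return cleaned


def convert(records):
    # Convert the textual amount into a numeric form suitable for analysis.
    return [{"order_id": r["order_id"], "amount": float(r["amount"])} for r in records]


def analyze(records):
    # Descriptive statistics over the converted records.
    amounts = [r["amount"] for r in records]
    return {"count": len(amounts), "total": sum(amounts), "mean": mean(amounts)}


def apply_result(summary):
    # Data application: turn the analysis result into a one-line report.
    return (f"{summary['count']} orders, total {summary['total']:.2f}, "
            f"mean {summary['mean']:.2f}")


def run_pipeline():
    return apply_result(analyze(convert(clean(acquire()))))
```

Calling `run_pipeline()` chains the stages in the order the text gives them. Filling a missing amount with zero is only one possible policy and would skew the mean in practice.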
Data processing design module:
A data processing flow is designed according to the defined data processing tasks. The flow has two main aspects: data flow scheduling and data service invocation. Data flow scheduling determines the execution order and dependencies of the data processing tasks so that data moves through the flow correctly. Data service invocation calls the corresponding data services, such as a data storage service and a data computing service, as required.
Data flow scheduling:
Determining the task execution order: the execution order is derived from the dependencies among tasks. Some tasks must wait until other tasks have completed, while others can run in parallel.
Establishing task dependencies: the dependencies between tasks are identified, that is, which tasks rely on the output of which other tasks. This ensures that data flows correctly through the process and avoids conflicts or inconsistent data between tasks.
Scheduling task execution: tasks are scheduled according to their execution order and dependencies so that each task runs at the appropriate time. Task scheduling may be implemented with a workflow management system or with the scheduling mechanisms of a programming language.
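One way to derive an execution order from task dependencies is a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the task names and the dependency graph are hypothetical examples, not part of the patent.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "acquire": set(),
    "clean": {"acquire"},
    "convert": {"clean"},
    "analyze": {"convert"},
    "archive": {"convert"},
    "report": {"analyze"},
}


def schedule_batches(dependencies):
    """Group tasks into batches; tasks within one batch could run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
        batches.append(ready)
        sorter.done(*ready)                 # mark the batch as finished
    return batches
```

With this graph, `analyze` and `archive` land in the same batch because both depend only on `convert`. A production system would more likely delegate this to a workflow management system, as the text notes.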
Data service invocation:
Data storage service: an appropriate data storage service is invoked to store and manage data as needed. Options include relational databases, NoSQL databases, and data warehouses. The data storage service can hold raw data, intermediate results, and final results for later use and querying.
Data computing service: the corresponding data computing service is invoked to carry out data processing operations according to task requirements, such as data cleaning, data conversion, and data analysis. The data computing service may be an in-house code module, a third-party library, or a service provided on a cloud platform.
Other data services: depending on the specific requirements, other data services such as data API services and data collection tools may also be invoked. These services can help obtain data from external sources or provide other functions needed for data processing.
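As an illustration of a data storage service that keeps raw data, intermediate results, and final results, the following sketch wraps an in-memory SQLite database from the Python standard library. The `StorageService` class, the table layout, and the stage names are assumptions made for this example only.

```python
import json
import sqlite3


class StorageService:
    """Hypothetical storage service that keeps results per processing stage."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS results ("
            "stage TEXT NOT NULL, record_key TEXT NOT NULL, payload TEXT NOT NULL, "
            "PRIMARY KEY (stage, record_key))"
        )

    def save(self, stage, record_key, payload):
        # Store the payload as JSON; an existing entry for the same key is replaced.
        self.conn.execute(
            "INSERT OR REPLACE INTO results (stage, record_key, payload) "
            "VALUES (?, ?, ?)",
            (stage, record_key, json.dumps(payload)),
        )
        self.conn.commit()

    def load(self, stage, record_key):
        # Return the stored payload, or None when nothing was saved under this key.
        row = self.conn.execute(
            "SELECT payload FROM results WHERE stage = ? AND record_key = ?",
            (stage, record_key),
        ).fetchone()
        return json.loads(row[0]) if row else None
```

The composite primary key doubles as the index used for lookups, so a query by stage and key does not need a full table scan.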
Data processing integration module:
The implemented data processing function modules are integrated into the business middle platform so that the data processing functions can be operated and managed independently. The business middle platform is a unified platform that integrates the functions and services of the business systems, including data processing functions. Integrating the data processing functions into the business middle platform yields a unified data processing flow and unified management.
Building the business middle platform: first, a business middle platform must be established. It may be a unified platform or framework that integrates the functions and services of each business system. The middle platform can provide a unified data processing flow and management, as well as other shared functions and services such as user management, permission management, and logging.
Defining interfaces and specifications: within the middle platform, suitable interfaces and specifications are defined for interacting with the data processing function modules. They may cover the formats of data input and output, the calling conventions, and parameter definitions. Clear interfaces and specifications ensure compatibility between the data processing function modules and the business middle platform and keep the platform extensible.
Integrating the data processing function modules: the implemented data processing function modules are integrated into the business middle platform. This can be done by deploying the modules to the platform's servers and adapting them to the defined interfaces and specifications. The modules may take the form of separate services, libraries, or plug-ins so that they are easy to manage and invoke.
Implementing the data processing flow: within the business middle platform, the corresponding data processing function modules are invoked according to the defined data processing flow and task schedule, and data is processed in the preset order and according to the dependencies. Data flows between the function modules through acquisition, cleaning, conversion, and analysis until the final result is obtained.
Providing management and monitoring functions: the business middle platform can provide management and monitoring for the data processing functions, including task scheduling, task status monitoring, logging, and error handling. These functions help operations staff monitor and manage data processing and ensure its stability and reliability.
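The plug-in style of integration described above can be sketched as a small registry. The `MiddlePlatform` class, its single-argument module interface, and the status strings are hypothetical choices for this example; the patent does not prescribe a concrete interface.

```python
class MiddlePlatform:
    """Hypothetical registry in which function modules are plugged in by name."""

    def __init__(self):
        self._modules = {}
        self.task_log = []  # (module name, status) pairs for monitoring

    def register(self, name, func):
        # Assumed interface convention: a module is a callable taking and returning data.
        if not callable(func):
            raise TypeError(f"module {name!r} must be callable")
        self._modules[name] = func

    def run(self, module_names, data):
        # Invoke the modules in the given order, recording the status of each task.
        for name in module_names:
            try:
                data = self._modules[name](data)
            except Exception as exc:
                self.task_log.append((name, f"failed: {exc}"))
                raise
            self.task_log.append((name, "succeeded"))
        return data
```

A module is registered with `platform.register("dedupe", some_function)` and a flow is run with `platform.run(["strip", "dedupe"], rows)`. The task log gives operations staff a simple record of which step failed.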
Data service interface module:
Data service interfaces are provided to business systems and other application systems according to business requirements, so that data can be shared and applied. Through these interfaces, other systems can conveniently invoke the data processing functions, obtain the processed results, and apply them to their own business scenarios. This enables data sharing and reuse and improves cooperation and efficiency between systems.
Defining the data service interface: a suitable data service interface is defined according to the business requirements and the outputs of the data processing function modules. The interface should explicitly define the input parameters, the format and content of the output, the calling convention, and the permission controls.
Implementing the data service interface: according to the defined interface, the data processing function modules are packaged as callable data services. This can be done by writing APIs, web services, or microservices. The data service interface should offer good availability, reliability, and performance.
Data sharing and application: through the data service interface, business systems and other application systems can invoke the data processing functions and reuse the processed results, which reduces duplicated data processing work.
Interface security and permission control: the security of the data service interface and the permissions governing it must be ensured. Authentication, authorization mechanisms, and encrypted transmission can be used to restrict access so that only authorized systems and users can invoke the data service interface.
Documentation and testing: detailed documentation is written for the data service interface, covering how to use the interface, its parameters, and the meaning of the returned results. The interface is also tested thoroughly to verify its correctness and stability and to confirm that it behaves as expected.
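A data service interface with a basic permission check might look like the sketch below. The token store, dataset names, and response shape are all hypothetical; a real deployment would sit behind an authenticated API over an encrypted transport rather than use hard-coded tokens.

```python
import hmac

# Hypothetical credential store and result cache, hard-coded for illustration only.
AUTHORIZED_TOKENS = {"reporting-system": "s3cr3t-token"}
PROCESSED_RESULTS = {"daily_sales": {"count": 3, "total": 30.5}}


def get_result(caller, token, dataset):
    """Return a processed dataset to an authorized caller.

    Input parameters: caller id, access token, dataset name.
    Output: a dict with a numeric status and either data or an error message.
    """
    expected = AUTHORIZED_TOKENS.get(caller)
    # compare_digest avoids leaking token contents through timing differences.
    if expected is None or not hmac.compare_digest(expected, token):
        return {"status": 403, "error": "caller is not authorized"}
    if dataset not in PROCESSED_RESULTS:
        return {"status": 404, "error": f"unknown dataset: {dataset}"}
    return {"status": 200, "data": PROCESSED_RESULTS[dataset]}
```

Keeping the input parameters and the output format explicit in one place mirrors the interface-definition step, and the assertions a caller would write against this function double as the interface tests the text calls for.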
And the data quality monitoring module is used for:
in data processing tasks, the quality of the data is critical. The data quality monitoring module can monitor the accuracy, the integrity and the consistency index of the data and timely discover and process the data quality problem. This module may include data anomaly detection, data consistency verification, data integrity verification functions to help improve the quality and reliability of the data.
Data anomaly detection:
Outlier detection: Abnormal values or abnormal patterns in the data are identified by statistical methods, rule-based detection, or machine learning algorithms. These outliers may stem from erroneous records, anomalous measurements, or problems with the data source.
Data distribution detection: The distribution of the data is analyzed to identify data that deviates from its usual distribution. For example, it may be checked whether there is significant skew or whether outliers are present.
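As an illustration only, the statistical variant of outlier detection could be sketched as follows in Python. The interquartile-range (IQR) rule, the function name `iqr_outliers`, the multiplier `k`, and the sample readings are assumptions chosen for this sketch; the text does not prescribe a particular statistical method.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sensor readings with one obviously erroneous record.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 55.0]
print(iqr_outliers(readings))
```

A rule-based or machine-learning detector could be substituted behind the same function signature without changing the callers.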
Data consistency verification:
Data consistency detection: Consistency between multiple data sources or data sets is verified. For example, the same fields or identifiers in different data sources may be compared to check for inconsistencies.
Logic rule detection: Logic rules are defined and applied to verify whether the data complies with the expected business logic and rules. Examples include checking that the order amount is greater than zero and verifying the chronological order of dates.
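The two checks above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the record layout (dictionaries keyed by an identifier), the field names `email`, `amount`, `ordered_on`, and `shipped_on`, and the rule names are all hypothetical and are not taken from the text.

```python
from datetime import date

def inconsistent_keys(source_a, source_b, field):
    """Return the ids present in both sources whose `field` values differ."""
    shared = source_a.keys() & source_b.keys()
    return sorted(k for k in shared if source_a[k][field] != source_b[k][field])

def rule_violations(order):
    """Return the names of the business rules that the order breaks."""
    violations = []
    if order["amount"] <= 0:  # the order amount must be greater than zero
        violations.append("amount_positive")
    if order["shipped_on"] < order["ordered_on"]:  # shipping cannot precede ordering
        violations.append("date_order")
    return violations

# Hypothetical data from two systems that should agree on the email field.
crm = {1: {"email": "a@example.com"}, 2: {"email": "b@example.com"}}
billing = {1: {"email": "a@example.com"}, 2: {"email": "b@corp.example"}}
print(inconsistent_keys(crm, billing, "email"))

bad_order = {"amount": -5, "ordered_on": date(2023, 7, 12), "shipped_on": date(2023, 7, 1)}
print(rule_violations(bad_order))
```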
Data integrity verification:
Missing value detection: Missing values in the data are identified and handled, for example by filling them in or estimating them by interpolation.
Data integrity verification: Whether the data is complete is verified, for example by detecting missing records, missing critical fields, or invalid references.
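Missing value detection and interpolation might look like the following Python sketch. It assumes that missing values are represented as `None` in an evenly spaced numeric series and that linear interpolation is acceptable; the function names are hypothetical.

```python
def missing_indices(values):
    """Return the positions at which a value is missing (None)."""
    return [i for i, v in enumerate(values) if v is None]

def interpolate(values):
    """Fill interior gaps linearly; leading and trailing gaps stay None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk over each pair of neighbouring known points and fill between them.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled

series = [1.0, None, 3.0, None, None, 6.0]
print(missing_indices(series))
print(interpolate(series))
```

Gaps at the start or end of the series are deliberately left unfilled here, because there is no second known point to interpolate against.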
The method of the data security and privacy protection module comprises the following steps:
applying data encryption, access control, and desensitization measures;
protecting the security and privacy of the data;
integrating with the data processing flow to ensure the security and privacy of the data during processing.
Data security and privacy protection module:
With the widespread use of data processing, the security and privacy protection of data are becoming increasingly important. The data security and privacy protection module can take various measures, such as data encryption, access control, and desensitization, to protect the security and privacy of data. The module can be integrated with the data processing flow to ensure the security and privacy of the data during processing.
Data encryption:
Data transmission encryption: Data is encrypted in transit using a secure transmission protocol (such as HTTPS) to ensure its security during network transmission.
Data storage encryption: Data is encrypted at rest in the storage medium to prevent unauthorized access to sensitive data.
Access control:
Identity authentication: The user is required to provide valid authentication information to verify their access rights.
Permission management: Users' access to and operations on the data are limited according to their roles and permissions, ensuring that only authorized users can access sensitive data.
Audit log: Users' access and operations are recorded for monitoring and tracking.
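A role-based permission check combined with an audit log could be sketched as follows. The roles, the permission table `ROLE_PERMISSIONS`, and the in-memory `audit_log` list are hypothetical simplifications for illustration; a real deployment would typically authenticate the user first and persist the log.

```python
from datetime import datetime, timezone

# Hypothetical mapping from role to the actions that role may perform.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "admin": {"read", "write", "delete"},
}

audit_log = []  # in-memory stand-in for a persistent audit trail

def authorize(user, role, action, resource):
    """Decide whether the action is allowed and record the attempt."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "action": action,
        "resource": resource,
        "allowed": allowed,
    })
    return allowed

print(authorize("alice", "analyst", "read", "orders"))
print(authorize("alice", "analyst", "delete", "orders"))
```

Denied attempts are logged as well as granted ones, since rejected requests are often the more interesting entries when tracking abnormal behavior.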
Data desensitization:
Anonymization: Sensitive data is de-identified, for example by replacing or encrypting identity information and account numbers with a desensitization algorithm or a hash function, so as to protect user privacy.
Data masking: Part or all of the content of the sensitive data is masked or deleted to reduce the risk of exposing sensitive information.
Data perturbation: The sensitive data is slightly randomized or perturbed to reduce the likelihood that the original values can be recovered.
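The three desensitization techniques above can be illustrated with a short Python sketch. The use of keyed HMAC-SHA256 for the hash function, the masking format, the uniform noise model, and all function names are assumptions made for this example. Note that keyed hashing is, strictly speaking, pseudonymization: whoever holds the key can recompute the mapping.

```python
import hashlib
import hmac
import random

def pseudonymize(value: str, key: bytes) -> str:
    """Replace an identifier with a keyed hash so the raw value is not stored."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def mask(value: str, visible: int = 4, fill: str = "*") -> str:
    """Hide all but the last `visible` characters of a sensitive string."""
    if visible <= 0:
        return fill * len(value)
    return fill * max(len(value) - visible, 0) + value[-visible:]

def perturb(value: float, scale: float, rng: random.Random) -> float:
    """Add uniform noise in [-scale, scale] to a numeric value."""
    return value + rng.uniform(-scale, scale)

key = b"demo-key-not-for-production"  # hypothetical secret
print(pseudonymize("alice@example.com", key)[:16])
print(mask("6222021234567890"))
print(perturb(100.0, 0.5, random.Random(0)))
```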
Security audit and monitoring:
Security audit: Security events, access records, and abnormal behavior during data processing are monitored and recorded for subsequent auditing and analysis.
Real-time monitoring: Security events during data processing are monitored in real time so that potential security threats are discovered promptly.
The method of the data visualization module comprises the following functions:
displaying the data processing results in the form of charts, reports, and dashboards;
providing rich visualization functions and interaction modes;
meeting the requirements of different users.
Data visualization module:
The results of data processing typically need to be presented to users or other systems in a visual manner. The data visualization module can display the data processing results in the form of charts, reports, and dashboards, so that users can understand and apply the data more intuitively. The module can provide rich visualization functions and interaction modes to meet the requirements of different users.
Chart display:
Line charts, bar charts, and pie charts: for showing the trends, distributions, and proportions of the data.
Scatter plots, bubble charts, and heat maps: for showing the correlation and distribution density of the data.
Tree diagrams, radar charts, and maps: for showing hierarchical relationships and the distribution of multidimensional data.
Report generation:
Tabular report: The data is displayed in table form, with support for sorting, filtering, and paging.
Summary report: The data is summarized and aggregated as totals, averages, and percentages.
Dynamic report: Report content is displayed dynamically and data is filtered through parameter configuration or interactive operations.
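The aggregation behind a summary report (totals, averages, and percentages) could be sketched as follows. The row layout and the column names `region` and `sales` are hypothetical, and the function only prepares the figures; rendering them as a table or chart is left out.

```python
from collections import defaultdict

def summarize(rows, group_key, value_key):
    """Aggregate rows into per-group total, average, and share of the grand total."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
        counts[row[group_key]] += 1
    grand_total = sum(totals.values())
    return {
        group: {
            "total": total,
            "average": total / counts[group],
            "percent": 100.0 * total / grand_total if grand_total else 0.0,
        }
        for group, total in totals.items()
    }

# Hypothetical sales records grouped by region.
rows = [
    {"region": "east", "sales": 100},
    {"region": "east", "sales": 300},
    {"region": "west", "sales": 100},
]
for region, figures in summarize(rows, "region", "sales").items():
    print(region, figures)
```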
Dashboard and indicator display:
Dashboard: The real-time status and trends of key indicators are displayed in dashboard form to monitor how the business is running.
Key indicator card: The values and trends of key indicators are displayed so that users can quickly understand the business status.
Interaction functions:
Filtering and screening: Users can filter the data by specific conditions to view the subset of interest according to their own needs.
Hover and click: Mouse hover and click functions display detailed information about the data, helping users understand it in depth.
Customization: Users can customize chart types, colors, and styles to suit their personal needs.
Responsive design:
Support for different devices: The data visualization provides a good user experience across different screen sizes and devices.
Dynamic layout: The layout of charts and components is adjusted according to the screen size and user interaction to suit different scenarios.
The method of the data storage and retrieval module comprises the following functions:
providing efficient data storage and retrieval functions;
providing data persistence storage, index establishment, and query optimization functions;
integrating tightly with the data processing flow to provide fast and reliable data access capabilities.
Data storage and retrieval module:
During data processing, data needs to be stored and retrieved. The data storage and retrieval module can provide efficient data storage and retrieval functions, including data persistence storage, index establishment, and query optimization. This module can be tightly integrated with the data processing flow, providing fast and reliable data access capabilities.
Data persistence storage:
Database system: Data is persisted using a relational database (e.g., MySQL, Oracle) or a non-relational database (e.g., MongoDB, Redis).
File system: Data is stored as files in a local file system or a distributed file system (e.g., Hadoop HDFS).
Index establishment:
Index structure: To improve data retrieval efficiency, a suitable index structure, such as a B+ tree or a hash index, can be designed according to the characteristics of the data.
Index optimization: The performance and space usage of the index are optimized by adjusting its size and partition parameters.
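The effect of establishing an index can be observed with a small Python sketch. SQLite is used here only because it ships with the Python standard library; it is not one of the databases named in the text, and its indexes are B-trees rather than the B+ trees mentioned above. The table `orders`, its columns, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)
# Index the column used in the WHERE clause of the frequent lookup.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

def query_plan(sql, params=()):
    """Return the steps SQLite reports for executing the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

lookup = "SELECT * FROM orders WHERE customer_id = ?"
plan = query_plan(lookup, (42,))
print(plan)
```

Inspecting the reported plan before and after creating the index is a simple way to confirm that a lookup no longer scans the whole table.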
Query optimization:
Query analyzer: User queries are analyzed and optimized to generate efficient query plans.
Caching mechanism: Frequently used data or query results are cached to reduce access to the underlying storage system and improve query speed.
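The caching mechanism can be sketched in Python with a least-recently-used cache in front of a slow read. The function `read_from_store` is a hypothetical stand-in for the underlying storage system, and the cache size is an arbitrary choice for this example.

```python
from functools import lru_cache

stats = {"backend_reads": 0}  # counts how often the slow store is actually hit

def read_from_store(key):
    """Stand-in for a slow read against the underlying storage system."""
    stats["backend_reads"] += 1
    return f"value-for-{key}"

@lru_cache(maxsize=256)
def cached_read(key):
    """Serve repeated reads from memory, evicting least recently used entries."""
    return read_from_store(key)

for key in ["a", "b", "a", "a"]:
    cached_read(key)
print(stats["backend_reads"], cached_read.cache_info())
```

Of the four reads in the loop, only the first read of each distinct key reaches the store. A cache like this needs an invalidation step (for example `cached_read.cache_clear()`) whenever the underlying data changes, otherwise stale results are returned.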
Data backup and recovery:
Data backup: The data is backed up periodically to prevent loss or corruption.
Disaster recovery mechanism: Redundant storage and backup data centers are used so that the data can be recovered quickly when a disaster occurs.
Distributed storage and retrieval:
Distributed file system: Data is distributed across multiple nodes for storage, improving storage capacity and scalability.
Distributed database: Data is sharded across multiple nodes to enable distributed storage and querying.
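One common way to shard data across nodes is to hash each record key and take the result modulo the number of shards. The following Python sketch shows this under stated assumptions: the text does not say which sharding scheme is used, and the function names and key format are hypothetical.

```python
import hashlib
from collections import defaultdict

def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index in a stable, process-independent way."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def partition(keys, num_shards):
    """Group keys by the shard that would store them."""
    shards = defaultdict(list)
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return dict(shards)

keys = [f"user-{i}" for i in range(10)]
print(partition(keys, 4))
```

A cryptographic hash is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would route the same key to different shards on different nodes. Plain modulo sharding also moves most keys when the shard count changes; consistent hashing is a frequent alternative when nodes are added or removed often.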
While the preferred embodiments of the present invention have been described in detail, the invention is not limited to the above embodiments, and various changes may be made within the knowledge of those skilled in the art without departing from its spirit.
Claims (9)
1. A data management system, comprising:
defining a data processing module: defining the data processing tasks to be performed according to business requirements;
designing a data processing module: designing a data processing flow according to the defined data processing tasks;
implementing a data processing module: implementing specific data processing functions according to the requirements of the data processing tasks and the data processing flow;
integrating a data processing module: integrating the implemented data processing function modules into a business center to realize independent operation and management of the data processing functions;
providing a data service interface module: providing data service interfaces for business systems and other application systems according to business requirements, so as to realize data sharing and application;
a data quality monitoring module: monitoring the accuracy, integrity, and consistency indicators of the data, and finding and handling data quality problems;
a data security and privacy protection module: ensuring the security and privacy of the data during processing by integrating with the data processing flow;
a data visualization module: displaying the data processing results in the form of charts, reports, and dashboards, and providing rich visualization functions and interaction modes;
a data storage and retrieval module: providing efficient data storage and retrieval functions, tightly integrated with the data processing flow, and providing fast and reliable data access capabilities.
2. The data management system of claim 1, wherein the method of defining the data processing module comprises the steps of:
defining data processing tasks according to business requirements, wherein the data processing tasks comprise data acquisition, data cleaning, data conversion, data analysis, and data application modules;
a data acquisition module: obtaining raw data from different data sources;
a data cleaning module: performing the preprocessing operations of filtering, de-duplication, and missing-value filling on the data;
a data conversion module: converting the data format into the required form;
a data analysis module: carrying out statistical, mining, and modeling analysis on the data;
a data application module: applying the analysis results to business scenarios.
3. The data management system of claim 1, wherein the method of designing the data processing module comprises the steps of:
designing a data processing flow according to the defined data processing task;
data flow scheduling: determining the execution order and dependencies of the data processing tasks to ensure that data flows correctly through the process;
data service call: calling the corresponding data services, such as a data storage service and a data computing service, as required.
4. The data management system of claim 1, wherein the method of integrating the data processing module comprises the steps of:
integrating the implemented data processing function modules into the business center;
realizing independent operation and management of the data processing functions;
wherein the business center is a unified platform for integrating the functions and services of various business systems, including the data processing functions.
5. The data management system of claim 1, wherein the method of providing a data service interface comprises the steps of:
providing data service interfaces for business systems and other application systems according to business requirements;
realizing the sharing and application of data;
wherein other systems can conveniently call the data processing functions, obtain the processed results, and apply them to their own business scenarios.
6. The data management system of claim 1, wherein the method of the data quality monitoring module comprises the functions of:
monitoring the accuracy, integrity, and consistency indicators of the data;
finding and handling data quality problems in a timely manner;
providing data anomaly detection, data consistency verification, and data integrity verification functions.
7. The data management system of claim 1, wherein the method of the data security and privacy protection module comprises the steps of:
applying data encryption, access control, and desensitization measures;
protecting the security and privacy of the data;
integrating with the data processing flow to ensure the security and privacy of the data during processing.
8. The data management system of claim 1, wherein the method of the data visualization module comprises the functions of:
displaying the data processing results in the form of charts, reports, and dashboards;
providing rich visualization functions and interaction modes;
meeting the requirements of different users.
9. The data management system of claim 1, wherein the method of the data storage and retrieval module comprises the functions of:
providing efficient data storage and retrieval functions;
providing data persistence storage, index establishment, and query optimization functions;
tightly integrated with the data processing flow, providing fast and reliable data access capabilities.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310852654.6A CN116910023A (en) | 2023-07-12 | 2023-07-12 | Data management system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116910023A (en) | 2023-10-20 |
Family
ID=88354319
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310852654.6A Withdrawn CN116910023A (en) | 2023-07-12 | 2023-07-12 | Data management system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116910023A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117131036A (en) * | 2023-10-26 | 2023-11-28 | 环球数科集团有限公司 | Data maintenance system based on big data and artificial intelligence |
CN117131036B (en) * | 2023-10-26 | 2023-12-22 | 环球数科集团有限公司 | Data maintenance system based on big data and artificial intelligence |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11314723B1 (en) | Anomaly detection | |
US10135609B2 (en) | Managing a database management system using a blockchain database | |
US9734230B2 (en) | Cross system analytics for in memory data warehouse | |
US9773048B2 (en) | Historical data for in memory data warehouse | |
US20150074037A1 (en) | In Memory Database Warehouse | |
US11640476B2 (en) | Methods for big data usage monitoring, entitlements and exception analysis | |
US20120254416A1 (en) | Mainframe Event Correlation | |
US20120254337A1 (en) | Mainframe Management Console Monitoring | |
CN112860777B (en) | Data processing method, device and equipment | |
Tsai et al. | Data provenance in SOA: security, reliability, and integrity | |
CN113158233B (en) | Data preprocessing method and device and computer storage medium | |
US8738768B2 (en) | Multiple destinations for mainframe event monitoring | |
CN113722301A (en) | Big data processing method, device and system based on education information and storage medium | |
Seenivasan | ETL (extract, transform, load) best practices | |
KR100903726B1 (en) | System for Evaluating Data Quality Management Maturity | |
CN116910023A (en) | Data management system | |
CN116541372A (en) | Data asset management method and system | |
Bauer et al. | Building and operating a large-scale enterprise data analytics platform | |
CN115396260A (en) | Intelligent medical data gateway system | |
WO2019106177A1 (en) | Automated logging | |
CN109885543A (en) | Log processing method and device based on big data cluster | |
Malik et al. | Big Data: Risk Management & Software Testing | |
KR102656871B1 (en) | Data management device, data management method and a computer-readable storage medium for storing data management program | |
Phoghat et al. | Analysis of security techniques and issues in Data Warehouse | |
KR102657160B1 (en) | Data management device, data management method and a computer-readable storage medium for storing data management program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20231020 |