CN116910023A - Data management system - Google Patents

Publication number
CN116910023A
Authority
CN
China
Prior art keywords
data
data processing
module
service
functions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202310852654.6A
Other languages
Chinese (zh)
Inventor
徐欢
王伟东
王路权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Quanwang Digital Commerce Technology Co ltd
Original Assignee
Beijing Quanwang Digital Commerce Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Quanwang Digital Commerce Technology Co ltd
Priority to CN202310852654.6A
Publication of CN116910023A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G06F16/258 Data format conversion from or to a database
    • G06F16/26 Visual data mining; Browsing structured data
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures

Abstract

The invention discloses a data management system comprising: a data processing definition module; a data processing design module; a data processing implementation module; a data processing integration module; a data service interface module; a data quality monitoring module; a data security and privacy protection module; a data visualization module; and a data storage and retrieval module. Working together, these modules define, design, implement and integrate data processing, provide data services, monitor data quality, protect data security and privacy, visualize data, and store and retrieve data, thereby improving the efficiency, consistency and reliability of data processing.

Description

Data management system
Technical Field
The invention relates to a data management system.
Background
A middle-platform data processing system builds a unified data processing platform within an enterprise, integrating the data processing functions of each business system and enabling centralized management, sharing and reuse of data.
Without a middle platform for data processing, an enterprise faces duplicated development and maintenance work, data silos and isolated information, inconsistent data, inefficiency and wasted resources, difficulty assuring data quality, and the lack of a unified data service interface. A middle-platform data processing system addresses these problems: it provides a unified data processing platform, enables centralized management, sharing and reuse of data, and improves the efficiency, consistency and reliability of data processing.
Disclosure of Invention
The object of the present invention is to provide a data management system that solves the problems described in the Background above.
To achieve this object, the present invention provides the following technical solution:
A data management system, comprising:
a data processing definition module: defines the data processing tasks to be performed according to business requirements;
a data processing design module: designs the data processing flow according to the defined data processing tasks;
a data processing implementation module: implements the specific data processing functions required by the data processing tasks and the data processing flow;
a data processing integration module: integrates the implemented data processing function modules into the business middle platform so that the data processing functions can be operated and managed independently;
a data service interface module: provides data service interfaces to business systems and other application systems according to business requirements, enabling data sharing and application;
a data quality monitoring module: monitors the accuracy, integrity and consistency indicators of the data, and detects and handles data quality problems;
a data security and privacy protection module: is integrated with the data processing flow to ensure the security and privacy of data during processing;
a data visualization module: displays data processing results as charts, reports and dashboards, and provides rich visualization functions and interaction modes;
a data storage and retrieval module: provides efficient data storage and retrieval functions, is tightly integrated with the data processing flow, and provides fast and reliable data access.
As a further aspect of the invention, the data processing definition module operates as follows:
data processing tasks are defined according to business requirements and comprise data acquisition, data cleaning, data conversion, data analysis and data application;
data acquisition: obtaining raw data from different data sources;
data cleaning: preprocessing the data by filtering, de-duplication and filling of missing values;
data conversion: converting the data into the required format;
data analysis: performing statistics, mining and modeling analysis on the data;
data application: applying the analysis results to business scenarios.
As a further aspect of the invention, the data processing design module operates as follows:
a data processing flow is designed according to the defined data processing tasks;
data flow scheduling: determining the execution order and dependencies of the data processing tasks so that data flows correctly through the process;
data service calls: calling the corresponding data services, such as a data storage service and a data computing service, as required.
As a further aspect of the invention, the data processing integration module operates as follows:
the implemented data processing function modules are integrated into the business middle platform;
the data processing functions are operated and managed independently;
the business middle platform is a unified platform that integrates the functions and services of the various business systems, including the data processing functions.
As a further aspect of the invention, the data service interface module operates as follows:
data service interfaces are provided to business systems and other application systems according to business requirements;
data is shared and applied through these interfaces;
other systems can conveniently call the data processing functions, obtain the processed results and apply them to their own business scenarios.
As a further aspect of the invention, the data quality monitoring module provides the following functions:
monitoring the accuracy, integrity and consistency indicators of the data;
detecting and handling data quality problems promptly;
data anomaly detection, data consistency verification and data integrity verification.
As a further aspect of the invention, the data security and privacy protection module operates as follows:
data encryption, access control and data desensitization measures are applied;
the security and privacy of the data are protected;
the module is integrated with the data processing flow to ensure the security and privacy of data during processing.
As a further aspect of the invention, the data visualization module provides the following functions:
displaying data processing results as charts, reports and dashboards;
providing rich visualization functions and interaction modes;
meeting the needs of different users.
As a further aspect of the invention, the data storage and retrieval module provides the following functions:
efficient data storage and retrieval;
persistent data storage, index creation and query optimization;
tight integration with the data processing flow, providing fast and reliable data access.
Compared with the prior art, the invention has the beneficial effects that:
By adopting the business middle platform design, the data processing functions are decoupled from the business systems and form an independent business middle platform, so that they can be operated and managed independently; this improves the efficiency and flexibility of data processing while reducing its cost and risk. In addition, data flow scheduling and data service calls are used to carry out data acquisition, processing, analysis and application, giving the enterprise comprehensive data support. The system is suitable for many kinds of enterprises, including e-commerce, financial, medical and educational enterprises, and can be customized to the characteristics and requirements of each enterprise to meet its specific business needs. The system can also monitor and manage data processing tasks and data service interfaces in real time, which improves the reliability and stability of data processing and further reduces its cost and risk.
Drawings
FIG. 1 is a block diagram of a data management system.
Description of the embodiments
The technical solution of this patent is described in further detail below with reference to specific embodiments.
Referring to FIG. 1, a data management system includes:
a data processing definition module: defines the data processing tasks to be performed according to business requirements;
a data processing design module: designs the data processing flow according to the defined data processing tasks;
a data processing implementation module: implements the specific data processing functions required by the data processing tasks and the data processing flow;
a data processing integration module: integrates the implemented data processing function modules into the business middle platform so that the data processing functions can be operated and managed independently;
a data service interface module: provides data service interfaces to business systems and other application systems according to business requirements, enabling data sharing and application;
a data quality monitoring module: monitors the accuracy, integrity and consistency indicators of the data, and detects and handles data quality problems;
a data security and privacy protection module: is integrated with the data processing flow to ensure the security and privacy of data during processing;
a data visualization module: displays data processing results as charts, reports and dashboards, and provides rich visualization functions and interaction modes;
a data storage and retrieval module: provides efficient data storage and retrieval functions, is tightly integrated with the data processing flow, and provides fast and reliable data access.
The data processing definition module operates as follows:
data processing tasks are defined according to business requirements and comprise data acquisition, data cleaning, data conversion, data analysis and data application;
data acquisition: obtaining raw data from different data sources;
data cleaning: preprocessing the data by filtering, de-duplication and filling of missing values;
data conversion: converting the data into the required format;
data analysis: performing statistics, mining and modeling analysis on the data;
data application: applying the analysis results to business scenarios.
Data processing definition module:
In this step, the data processing tasks to be performed are defined according to business requirements. These tasks may include data acquisition, data cleaning, data conversion, data analysis and data application. Data acquisition obtains raw data from different data sources; data cleaning preprocesses the data by filtering, de-duplication and filling of missing values; data conversion converts the data into the required format, for example from structured to unstructured form or vice versa; data analysis performs statistics, mining and modeling on the data; data application applies the analysis results to business scenarios.
Data acquisition: data acquisition obtains raw data from different data sources. These sources may include databases, log files, sensors and APIs. Its purpose is to collect the required data in one central location for subsequent processing and analysis.
Data cleaning: data cleaning preprocesses the raw data to ensure its quality and accuracy. Cleaning operations include filtering the data, removing invalid or erroneous records, de-duplicating, and filling in missing values. Cleaning improves the reliability and consistency of the data.
Data conversion: data conversion changes data from one format or form to another to meet specific requirements, for example converting structured data to unstructured data or unstructured data to structured data. It may also include normalizing and encoding the data for subsequent analysis and application.
Data analysis: data analysis applies statistics, mining and modeling to the data to extract valuable information and insight. It may include descriptive statistics, data mining algorithms and machine learning model training. Data analysis reveals correlations, trends and anomalies in the data and supports business decisions.
Data application: data application applies the results of data analysis to specific business scenarios to achieve business objectives. It may include generating reports, formulating policies, optimizing processes, and making predictions and recommendations. Through data application, analysis results are turned into concrete actions and value.
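As a minimal illustration, the acquisition, cleaning, conversion and analysis tasks described above could be chained as plain Python functions. The record layout (`id`, `amount`), the function names and the choice of filling missing amounts with the mean are assumptions made for this sketch; the patent does not prescribe them.

```python
from statistics import mean

def acquire(sources):
    """Data acquisition: collect raw records from several sources into one list."""
    return [record for source in sources for record in source]

def clean(records):
    """Data cleaning: drop records without an id, de-duplicate by id,
    and fill missing amounts with the mean of the known amounts."""
    valid = [r for r in records if r.get("id") is not None]
    unique = list({r["id"]: r for r in valid}.values())
    known = [r["amount"] for r in unique if r.get("amount") is not None]
    fill = mean(known) if known else 0.0
    return [
        {**r, "amount": r["amount"] if r.get("amount") is not None else fill}
        for r in unique
    ]

def convert(records):
    """Data conversion: normalise the amount field to float."""
    return [{**r, "amount": float(r["amount"])} for r in records]

def analyse(records):
    """Data analysis: a simple descriptive statistic over the cleaned data."""
    return {"count": len(records), "total": sum(r["amount"] for r in records)}

# Two hypothetical sources containing a duplicate, a missing value and an invalid row.
raw_sources = [
    [{"id": 1, "amount": 10}, {"id": 2, "amount": None}],
    [{"id": 1, "amount": 10}, {"id": None, "amount": 5}, {"id": 3, "amount": 30}],
]
summary = analyse(convert(clean(acquire(raw_sources))))
```

The data application step would then consume `summary`, for example to populate a report.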
The data processing design module operates as follows:
a data processing flow is designed according to the defined data processing tasks;
data flow scheduling: determining the execution order and dependencies of the data processing tasks so that data flows correctly through the process;
data service calls: calling the corresponding data services, such as a data storage service and a data computing service, as required.
Data processing design module:
A data processing flow is designed according to the defined data processing tasks. The flow has two main aspects: data flow scheduling and data service calls. Data flow scheduling determines the execution order and dependencies of the data processing tasks so that data flows correctly through the process. Data service calls invoke the corresponding data services, such as a data storage service and a data computing service, as required.
Data flow scheduling:
Determining the task execution order: the execution order of the tasks is determined from the dependencies between them. Some tasks must wait until other tasks have completed, while others can run in parallel.
Establishing task dependencies: the dependencies between tasks are determined, that is, which tasks depend on the output of which other tasks. This ensures that data flows correctly through the process and avoids conflicts or data inconsistency between tasks.
Scheduling task execution: tasks are scheduled according to their execution order and dependencies so that each task runs at the appropriate time. Task scheduling may be implemented with a workflow management system or with the scheduling mechanisms of a programming language.
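One way to derive an execution order from task dependencies is a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the task names and the dependency map are hypothetical. Tasks returned together by `get_ready()` have no unmet dependencies and could run in parallel, although this sketch runs them one after another.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks whose output it depends on (hypothetical names).
dependencies = {
    "acquire": set(),
    "clean": {"acquire"},
    "convert": {"clean"},
    "analyse": {"convert"},
    "report": {"analyse"},
    "archive_raw": {"acquire"},
}

def run_in_order(deps, run_task):
    """Run every task only after all of its dependencies have finished.

    Returns the batches of tasks that were ready at the same time.
    prepare() raises graphlib.CycleError if the dependencies are circular.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the order deterministic
        for name in ready:
            run_task(name)
            sorter.done(name)  # unblocks tasks that were waiting on this one
        batches.append(ready)
    return batches

executed = []
batches = run_in_order(dependencies, executed.append)
```

Here `clean` and `archive_raw` land in the same batch because both depend only on `acquire`.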
Data service calls:
Data storage service: the appropriate data storage services are called to store and manage data as required. These may include relational databases, NoSQL databases and data warehouses. The data storage service may hold raw data, intermediate results and final results for later use and querying.
Data computing service: the corresponding data computing services are called to carry out data processing operations according to task requirements, such as data cleaning, data conversion and data analysis. A data computing service may be a self-developed code module, a third-party library, or a service provided on a cloud platform.
Other data services: depending on specific needs, other data services such as data API services and data collection tools may also be called. These services help to obtain data from external data sources or provide other functions required for data processing.
The data processing integration module operates as follows:
the implemented data processing function modules are integrated into the business middle platform;
the data processing functions are operated and managed independently;
the business middle platform is a unified platform that integrates the functions and services of the various business systems, including the data processing functions.
Data processing integration module:
The data processing function modules are integrated into the business middle platform so that the data processing functions can be operated and managed independently. The business middle platform is a unified platform that integrates the functions and services of the various business systems, including the data processing functions. Integrating the data processing functions into the business middle platform gives a unified data processing flow and unified management.
Building the business middle platform: first, a business middle platform is established. It may be a unified platform or framework that integrates the functions and services of each business system. The middle platform may provide a unified data processing flow and management, as well as other shared functions and services such as user management, permission management and logging.
Defining interfaces and specifications: in the middle platform, suitable interfaces and specifications must be defined for interacting with the data processing function modules. They may cover the formats of data input and output, the calling convention and the parameter definitions. Clear interfaces and specifications ensure compatibility and scalability between the data processing function modules and the business middle platform.
Integrating the data processing function modules: the implemented data processing function modules are integrated into the business middle platform, for example by deploying them to the servers of the business middle platform and adapting them to the defined interfaces and specifications. The function modules may take the form of separate services, libraries or plug-ins for ease of management and invocation.
Implementing the data processing flow: within the business middle platform, the corresponding data processing function modules are called according to the defined data processing flow and task schedule, and data is processed in the predetermined order and according to the dependencies. Data flows between the function modules through the acquisition, cleaning, conversion and analysis stages until the final processing result is obtained.
Providing management and monitoring functions: the business middle platform can provide management and monitoring of the data processing functions, including task scheduling, task status monitoring, logging and error handling. These help operations staff monitor and manage the data processing functions and ensure the stability and reliability of data processing.
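The integration steps above can be sketched as a small registry: every function module follows one agreed interface (here, a callable that takes a list of records and returns a list of records), and the platform calls registered modules in the order given by the flow while logging each step. The class name `MiddlePlatform`, the interface and the module names are assumptions made for illustration only.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
# The agreed interface: a module takes a list of records and returns a list of records.
Module = Callable[[List[Record]], List[Record]]

class MiddlePlatform:
    """Hypothetical registry that hosts data processing modules as plug-ins."""

    def __init__(self) -> None:
        self._modules: Dict[str, Module] = {}
        self.log: List[str] = []

    def register(self, name: str, module: Module) -> None:
        if name in self._modules:
            raise ValueError(f"module already registered: {name}")
        self._modules[name] = module

    def run(self, flow: List[str], records: List[Record]) -> List[Record]:
        """Call the named modules in flow order, logging the outcome of each step."""
        for name in flow:
            try:
                records = self._modules[name](records)
                self.log.append(f"{name}: ok ({len(records)} records)")
            except Exception as exc:
                # Record the failure for the operators, then let the caller handle it.
                self.log.append(f"{name}: failed ({exc!r})")
                raise
        return records

platform = MiddlePlatform()
platform.register("drop_empty", lambda rs: [r for r in rs if r.get("name")])
platform.register(
    "uppercase_name", lambda rs: [{**r, "name": str(r["name"]).upper()} for r in rs]
)
result = platform.run(["drop_empty", "uppercase_name"], [{"name": "alice"}, {"name": ""}])
```

Keeping the interface this narrow is a design choice of the sketch: any module that honours it can be deployed as a library function, a plug-in, or a thin wrapper around a remote service.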
The data service interface module operates as follows:
data service interfaces are provided to business systems and other application systems according to business requirements;
data is shared and applied through these interfaces;
other systems can conveniently call the data processing functions, obtain the processed results and apply them to their own business scenarios.
Data service interface module:
Data service interfaces are provided to business systems and other application systems according to business requirements, so that data can be shared and applied. Through these interfaces, other systems can conveniently call the data processing functions, obtain the processed results and apply them to their own business scenarios. This enables data sharing and reuse and improves cooperation and efficiency between systems.
Defining the data service interface: suitable data service interfaces are defined according to business requirements and the output of the data processing function modules. Each interface should explicitly define its input parameters, the format and content of its output, the calling convention and its permission control.
Implementing the data service interface: following the defined interface, the data processing function modules are wrapped as callable data services, for example as APIs, web services or microservices. The data service interface should have good availability, reliability and performance.
Data sharing and application: through the data service interface, business systems and other application systems can conveniently call the data processing functions, obtain the processed results and apply them to their own business scenarios. This enables data sharing and reuse, reduces duplicated data processing work, and improves cooperation and efficiency between systems.
Interface security and permission control: the security and permission control of the data service interface must be ensured. Authentication, authorization mechanisms and encrypted transmission may be used to restrict access so that only authorized systems and users can call the data service interface.
Documentation and testing: detailed documentation is written for the data service interface, covering how to use it, its parameters and the meaning of its return values. The interface is also tested thoroughly to verify its correctness and stability and to confirm that it works as expected.
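A data service interface with explicit inputs, outputs and permission control might look like the following in-process sketch. The caller name, the token table, the `store` layout and the function `get_daily_totals` are all hypothetical; a real deployment would expose this through an API or web service over an encrypted transport and keep credentials in a secret store rather than in source code.

```python
import hmac

# Hypothetical table of systems allowed to call the service, with their tokens.
AUTHORIZED_TOKENS = {"reporting-system": "s3cret-token"}

class Unauthorized(Exception):
    """Raised when the caller is unknown or presents the wrong token."""

def get_daily_totals(caller: str, token: str, day: str, store: dict) -> dict:
    """Data service: return the record count and total for one day.

    Input:  caller id, access token, day as 'YYYY-MM-DD', and the data store.
    Output: {"day": str, "count": int, "total": float}
    """
    expected = AUTHORIZED_TOKENS.get(caller)
    # compare_digest compares in constant time, which avoids leaking token
    # contents through timing differences.
    if expected is None or not hmac.compare_digest(expected, token):
        raise Unauthorized(caller)
    amounts = store.get(day, [])
    return {"day": day, "count": len(amounts), "total": sum(amounts)}

store = {"2024-01-01": [10.0, 5.5]}
response = get_daily_totals("reporting-system", "s3cret-token", "2024-01-01", store)
```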
The data quality monitoring module provides the following functions:
monitoring the accuracy, integrity and consistency indicators of the data;
detecting and handling data quality problems promptly;
data anomaly detection, data consistency verification and data integrity verification.
Data quality monitoring module:
Data quality is critical in data processing tasks. The data quality monitoring module monitors the accuracy, integrity and consistency indicators of the data and detects and handles data quality problems promptly. The module may include data anomaly detection, data consistency verification and data integrity verification, which help improve the quality and reliability of the data.
Data anomaly detection:
Outlier detection: outliers or abnormal patterns in the data are identified by statistical methods, rule-based checks or machine learning algorithms. Such outliers may be erroneous records, abnormal measurements or problems with the data source.
Data distribution detection: the distribution of the data is analysed to identify data that deviates from its usual distribution, for example by detecting significant skew or outliers.
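As one example of the statistical methods mentioned above, a z-score test flags values that lie far from the mean in units of standard deviation. The threshold of 2.0 and the sample readings are assumptions for this sketch; rule-based or machine learning detectors would replace `find_outliers` in other settings.

```python
from statistics import mean, pstdev

def find_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    centre = mean(values)
    spread = pstdev(values)  # population standard deviation
    if spread == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - centre) / spread > threshold]

# Hypothetical sensor readings with one erroneous measurement.
readings = [10, 11, 9, 10, 12, 10, 11, 95]
outliers = find_outliers(readings)
```

With few samples a single extreme value inflates the standard deviation and limits how large any z-score can get, so the threshold has to be chosen with the sample size in mind.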
Data consistency verification:
Data consistency detection: consistency between multiple data sources or data sets is verified. For example, the same fields or identifiers in different data sources can be compared to find inconsistencies.
Logic rule detection: logic rules are defined and applied to verify that the data complies with the expected business logic and rules, for example checking that an order amount is greater than zero or that dates occur in the correct order.
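The two checks above can be sketched as follows. The order fields (`amount`, `ordered_on`, `shipped_on`) and the two example sources are hypothetical; the rules themselves (amount greater than zero, dates in order) are the examples given in the text.

```python
from datetime import date

def check_order(order):
    """Logic rule detection: return a list of violated business rules."""
    problems = []
    if order["amount"] <= 0:
        problems.append("amount must be greater than zero")
    if order["shipped_on"] is not None and order["shipped_on"] < order["ordered_on"]:
        problems.append("shipped_on precedes ordered_on")
    return problems

def inconsistent_ids(source_a, source_b, field):
    """Consistency detection: ids present in both sources whose field values differ."""
    shared = source_a.keys() & source_b.keys()
    return sorted(k for k in shared if source_a[k][field] != source_b[k][field])

bad_order = {"amount": 0, "ordered_on": date(2024, 1, 5), "shipped_on": date(2024, 1, 3)}
violations = check_order(bad_order)

crm = {1: {"email": "a@example.com"}, 2: {"email": "b@example.com"}}
billing = {1: {"email": "a@example.com"}, 2: {"email": "b@example.org"}, 3: {"email": "c@example.com"}}
mismatched = inconsistent_ids(crm, billing, "email")
```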
Data integrity verification:
Missing value detection: missing values in the data are identified and handled, for example by filling them in or estimating them by interpolation.
Data integrity verification: the data is verified to be complete, for example by detecting missing records, missing key fields or invalid references.
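A minimal sketch of these integrity checks, assuming hypothetical `orders` and `customers` record layouts: `missing_fields` reports absent key fields, `dangling_references` finds orders that point at unknown customers, and `interpolate_gaps` estimates interior missing values by linear interpolation between the nearest known neighbours.

```python
REQUIRED_FIELDS = ("id", "customer_id", "amount")  # assumed key fields

def missing_fields(record):
    """Return the required fields that are absent or None in the record."""
    return [f for f in REQUIRED_FIELDS if record.get(f) is None]

def dangling_references(orders, customers):
    """Return ids of orders whose customer_id matches no known customer."""
    known = {c["id"] for c in customers}
    return [
        o["id"] for o in orders
        if o.get("customer_id") is not None and o["customer_id"] not in known
    ]

def interpolate_gaps(values):
    """Fill interior None gaps linearly; leading and trailing gaps stay None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled

orders = [{"id": 1, "customer_id": 7, "amount": 9.5}, {"id": 2, "customer_id": 8, "amount": None}]
customers = [{"id": 7}]
series = interpolate_gaps([1.0, None, 3.0, None, None, 6.0])
```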
The data security and privacy protection module operates as follows:
data encryption, access control and data desensitization measures are applied;
the security and privacy of the data are protected;
the module is integrated with the data processing flow to ensure the security and privacy of data during processing.
Data security and privacy protection module:
As data processing becomes more widespread, protecting the security and privacy of data is increasingly important. The data security and privacy protection module can apply measures such as data encryption, access control and data desensitization to protect the security and privacy of data. The module can be integrated with the data processing flow to ensure the security and privacy of data during processing.
Data encryption:
Encryption in transit: data is encrypted during transmission using a secure transport protocol (such as HTTPS) to keep it secure on the network.
Encryption at rest: data held on storage media is encrypted to prevent unauthorized access to sensitive data.
Access control:
Identity authentication: users must present valid authentication information so that their access rights can be verified.
Permission management: each user's access to and operations on data are restricted according to the user's role and permissions, so that only authorized users can access sensitive data.
Audit log: users' access and operations are recorded for monitoring and tracing.
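Permission management and audit logging can be combined in a single check, as in this role-based sketch. The roles, users and permission names are hypothetical, and the sketch assumes that identity authentication has already taken place upstream, so `user` is a verified identity.

```python
from datetime import datetime, timezone

# Hypothetical role and user tables.
ROLE_PERMISSIONS = {"analyst": {"read"}, "admin": {"read", "write"}}
USER_ROLES = {"alice": "admin", "bob": "analyst"}

audit_log = []

def authorize(user, action, resource):
    """Decide whether the user may perform the action, and record the attempt."""
    role = USER_ROLES.get(user)
    allowed = role is not None and action in ROLE_PERMISSIONS.get(role, set())
    # Denied attempts are logged too, so they can be traced later.
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "resource": resource,
        "allowed": allowed,
    })
    return allowed

decisions = [
    authorize("alice", "write", "orders"),   # admin may write
    authorize("bob", "write", "orders"),     # analyst may only read
    authorize("mallory", "read", "orders"),  # unknown user has no role
]
```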
Data desensitization:
Anonymization: sensitive data is de-identified, for example by replacing or encrypting identity information and account numbers with a desensitization algorithm or a hash function, to protect user privacy.
Data masking: part or all of the sensitive content is masked or deleted to reduce the risk of exposing sensitive information.
Data perturbation: sensitive data is slightly randomized or perturbed to reduce the likelihood that the original values can be recovered.
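The three desensitization techniques can be sketched with the Python standard library. The function names, the secret key and the sample values are assumptions. The sketch uses a keyed hash (HMAC-SHA256) instead of a plain hash for de-identification, because plain hashes of short identifiers such as account numbers can be reversed by trying every candidate; this choice is an assumption of the sketch, not something the text specifies.

```python
import hashlib
import hmac
import random

def pseudonymize(value: str, key: bytes) -> str:
    """Anonymization: replace an identifier with a keyed SHA-256 digest."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def mask(value: str, visible: int = 4, fill: str = "*") -> str:
    """Data masking: hide everything except the last `visible` characters."""
    if len(value) <= visible:
        return fill * len(value)
    return fill * (len(value) - visible) + value[-visible:]

def perturb(value: float, scale: float, rng: random.Random) -> float:
    """Data perturbation: add uniform noise in the range [-scale, +scale]."""
    return value + rng.uniform(-scale, scale)

key = b"hypothetical-secret-key"  # in practice, load this from a secret store
token = pseudonymize("user-42", key)
masked_account = mask("6222021234567890")
noisy_salary = perturb(10000.0, 50.0, random.Random())
```

The same input and key always yield the same `token`, so records can still be joined on the pseudonymized identifier.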
Security audit and monitoring:
Security audit: security events, access records, and abnormal behavior during data processing are monitored and recorded for subsequent auditing and analysis.
Real-time monitoring: security events in the data processing flow are monitored in real time so that potential security threats are discovered promptly.
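One simple way to picture real-time monitoring is a sliding-window counter over denied access attempts. The sketch below is an assumption-laden illustration, not the patent's method: the class `FailedAccessMonitor`, the threshold of three failures, and the 60-second window are all hypothetical choices.

```python
from collections import defaultdict, deque


class FailedAccessMonitor:
    """Flags a user who reaches `threshold` denied accesses within `window_seconds`."""

    def __init__(self, threshold: int = 3, window_seconds: float = 60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # user -> timestamps of recent failures

    def record_failure(self, user: str, timestamp: float) -> bool:
        """Record a denied access; return True if the user should be flagged."""
        window = self._failures[user]
        window.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) >= self.threshold


monitor = FailedAccessMonitor(threshold=3, window_seconds=60.0)
alerts = [monitor.record_failure("bob", t) for t in (0.0, 10.0, 20.0, 100.0)]
```

In this run the third failure triggers a flag, and the fourth does not because the earlier failures have aged out of the window.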
The method of the data visualization module comprises the following functions:
displaying data processing results in the form of charts, reports, and dashboards;
providing rich visualization functions and interaction modes;
meeting the needs of different users.
Data visualization module:
The results of data processing typically need to be presented to users or other systems in visual form. The data visualization module can display data processing results as charts, reports, and dashboards, so that users can understand and apply the data more intuitively. The module can provide rich visualization functions and interaction modes to meet the needs of different users.
Chart display:
Line, bar, and pie charts: for showing trends, distributions, and proportions in the data.
Scatter plots, bubble charts, and heat maps: for showing correlations and distribution density in the data.
Tree diagrams, radar charts, and maps: for showing hierarchical relationships and the distribution of multidimensional data.
Report generation:
Tabular report: data is displayed in table form, with support for sorting, filtering, and pagination.
Summary report: data is aggregated into totals, averages, and percentages.
Dynamic report: report content is displayed dynamically and data is filtered through parameter configuration or interactive operations.
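The summary report described above (totals, averages, and percentages) can be sketched as a small grouping function. The function name `summarize`, the field names, and the sample sales rows are hypothetical and serve only to make the aggregation concrete.

```python
from collections import defaultdict


def summarize(rows, group_key, value_key):
    """Return per-group total, average, and percentage of the grand total."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
        counts[row[group_key]] += 1
    grand_total = sum(totals.values())
    report = {}
    for group, total in totals.items():
        report[group] = {
            "total": total,
            "average": total / counts[group],
            # Guard against division by zero when every value is 0.
            "percentage": 100.0 * total / grand_total if grand_total else 0.0,
        }
    return report


# Illustrative data only.
sales = [
    {"region": "north", "amount": 100.0},
    {"region": "north", "amount": 300.0},
    {"region": "south", "amount": 100.0},
]
report = summarize(sales, "region", "amount")
# report["north"] -> {"total": 400.0, "average": 200.0, "percentage": 80.0}
```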
Dashboard and indicator display:
Dashboard: the real-time status and trends of key indicators are displayed in dashboard form for monitoring how the service is running.
Key indicator card: the values and trends of key indicators are displayed so that users can quickly understand the service status.
Interaction functions:
Filtering and screening: users can filter the data by specific conditions to view the subset of data that interests them.
Hover and click: hovering over or clicking an element displays detailed information about the data, helping users understand it in depth.
Customization: users can customize chart types, colors, and styles to suit their individual needs.
Responsive design:
Support for different devices: the data visualization provides a good user experience across different screen sizes and devices.
Dynamic layout: the layout of charts and components is adjusted according to screen size and user interaction to suit different scenarios.
The method of the data storage and retrieval module comprises the following functions:
providing efficient data storage and retrieval functions;
including data persistence storage, index creation, and query optimization;
tightly integrating with the data processing flow to provide fast and reliable data access.
Data storage and retrieval module:
During data processing, data needs to be stored and retrieved. The data storage and retrieval module can provide efficient data storage and retrieval functions, including data persistence storage, index creation, and query optimization. The module can be tightly integrated with the data processing flow to provide fast and reliable data access.
Data persistence storage:
Database system: data is stored persistently in a relational database (e.g., MySQL, Oracle) or a non-relational database (e.g., MongoDB, Redis).
File system: data is stored as files, either in a local file system or in a distributed file system (e.g., Hadoop HDFS).
Index creation:
Index structure: to improve retrieval efficiency, a suitable index structure, such as a B+ tree or a hash index, can be designed according to the characteristics of the data.
Index optimization: index performance and space usage are optimized by adjusting the index size and partition parameters.
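The effect of creating an index can be observed with SQLite from Python's standard library. This is an illustrative sketch only: SQLite stands in for whichever database the system actually uses, and the table `orders`, its columns, and the index name `idx_orders_customer` are hypothetical. The exact wording of the query plan text varies between SQLite versions, so the sketch only inspects whether the index name appears in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

query = "SELECT amount FROM orders WHERE customer_id = ?"


def plan(sql, params):
    """Return the human-readable query plan that SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the textual detail.
    return " ".join(row[-1] for row in rows)


plan_before = plan(query, (42,))  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query, (42,))   # the plan now mentions idx_orders_customer

matching_rows = conn.execute(query, (42,)).fetchall()
```

Comparing `plan_before` and `plan_after` shows the lookup switching from a full scan to an index search, which is the retrieval-efficiency gain the index structure is meant to deliver.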
Query optimization:
Query analyzer: user queries are analyzed and optimized to generate an efficient query plan.
Caching mechanism: frequently used data or query results are cached, reducing access to the underlying storage system and improving query speed.
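The caching mechanism can be sketched with `functools.lru_cache` from Python's standard library. The functions `slow_lookup` and `cached_lookup` and the `stats` counter are hypothetical stand-ins: `slow_lookup` represents a query against the underlying storage system, and the counter only exists to show how many times the backend is actually reached.

```python
from functools import lru_cache

# Counts how often the (simulated) underlying storage is actually queried.
stats = {"backend_calls": 0}


def slow_lookup(key: str) -> str:
    """Stand-in for an expensive query against the underlying storage system."""
    stats["backend_calls"] += 1
    return key.upper()


@lru_cache(maxsize=128)
def cached_lookup(key: str) -> str:
    """Serve repeated queries from an in-memory least-recently-used cache."""
    return slow_lookup(key)


first = cached_lookup("order-1")   # miss: reaches the backend
second = cached_lookup("order-1")  # hit: served from the cache
info = cached_lookup.cache_info()
```

When the underlying data changes, `cached_lookup.cache_clear()` discards the cached entries so that stale results are not returned.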
Data backup and recovery:
Data backup: data is backed up periodically to prevent loss or corruption.
Disaster recovery mechanism: measures such as redundant storage and backup data centers are adopted so that data can be recovered quickly when a disaster occurs.
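As a small-scale illustration of backup and recovery, the sketch below uses the `Connection.backup` method of Python's standard `sqlite3` module to copy a live database into a second one. SQLite and the table `records` are assumptions chosen to keep the example self-contained; a production system would back up to durable storage in a separate location rather than to another in-memory database.

```python
import sqlite3

# Source database holding the live data (illustrative table and rows).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
source.executemany("INSERT INTO records (payload) VALUES (?)", [("a",), ("b",)])
source.commit()

# Copy the whole database into the backup connection.
backup = sqlite3.connect(":memory:")
source.backup(backup)

# Simulate data loss on the source, then read the data back from the backup.
source.execute("DELETE FROM records")
source.commit()
restored = backup.execute("SELECT payload FROM records ORDER BY id").fetchall()
remaining = source.execute("SELECT COUNT(*) FROM records").fetchone()[0]
```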
Distributed storage and retrieval:
Distributed file system: data is distributed across multiple nodes for storage, improving storage capacity and scalability.
Distributed database: data shards are stored on multiple nodes to enable distributed storage and querying.
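Storing data shards on multiple nodes can be sketched with hash-based routing. In this illustration the class `ShardedStore` and the function `shard_for` are hypothetical, and in-process dictionaries stand in for real storage nodes; the patent does not name a sharding scheme.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Routes each key to one of several shards (dicts stand in for nodes)."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
for i in range(100):
    store.put(f"user-{i}", {"id": i})
sample = store.get("user-42")
```

Simple modulo placement remaps most keys when the number of shards changes; consistent hashing is a common alternative when nodes are added or removed frequently.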
While the preferred embodiments of the present patent have been described in detail, the present patent is not limited to the above embodiments, and those skilled in the art may make various changes within the scope of their knowledge without departing from the spirit of the present patent.

Claims (9)

1. A data management system, comprising:
defining a data processing module: defining data processing tasks to be performed according to service requirements;
designing a data processing module: designing a data processing flow according to the defined data processing tasks;
realizing a data processing module: implementing specific data processing functions according to the requirements of the data processing tasks and the data processing flow;
integrating a data processing module: integrating the implemented data processing function modules into a service center to realize independent operation and management of the data processing functions;
providing a data service interface module: providing data service interfaces for service systems and other application systems according to service requirements, so as to realize data sharing and application;
a data quality monitoring module: monitoring the accuracy, integrity, and consistency indicators of the data, and finding and handling data quality problems;
a data security and privacy protection module: ensuring the security and privacy of data during processing by integrating with the data processing flow;
a data visualization module: displaying data processing results in the form of charts, reports, and dashboards, and providing rich visualization functions and interaction modes;
a data storage and retrieval module: providing efficient data storage and retrieval functions, tightly integrated with the data processing flow, and providing fast and reliable data access capabilities.
2. The data management system of claim 1, wherein the defining method of defining the data processing module comprises the steps of:
defining data processing tasks according to service requirements, wherein the data processing tasks comprise data acquisition, data cleaning, data conversion, data analysis and data application modules;
a data acquisition module: obtaining raw data from different data sources;
a data cleaning module: performing preprocessing operations on the data, including filtering, de-duplication, and filling in missing values;
a data conversion module: converting the data into the required format;
a data analysis module: performing statistics, mining, and modeling analysis on the data;
a data application module: applying the analysis results to business scenarios.
3. The data management system of claim 1, wherein the method of designing the data processing module comprises the steps of:
designing a data processing flow according to the defined data processing task;
data flow scheduling: determining the execution order and dependencies of the data processing tasks, ensuring that data flows correctly through the process;
data service calls: calling the corresponding data services, such as a data storage service and a data computing service, as required.
4. The data management system of claim 1, wherein the method of integrating the data processing module comprises the steps of:
integrating the realized data processing function module into a service center;
independent operation and management of the data processing function are realized;
the service center is a unified platform that integrates the functions and services of various service systems, including data processing functions.
5. The data management system of claim 1, wherein the method of providing a data service interface comprises the steps of:
providing data service interfaces for business systems and other application systems according to business requirements;
realizing the sharing and application of data;
other systems can conveniently call the data processing functions, obtain the processed data results, and apply them to their own business scenarios.
6. The data management system of claim 1, wherein the method of the data quality monitoring module comprises the functions of:
monitoring the accuracy, integrity, and consistency indicators of the data;
finding and handling data quality problems in a timely manner;
including data anomaly detection, data consistency verification, and data integrity verification.
7. The data management system of claim 1, wherein the method of the data security and privacy protection module comprises the steps of:
applying data encryption, access control, and desensitization measures;
protecting the security and privacy of the data;
integrating with the data processing flow to ensure the security and privacy of the data during processing.
8. The data management system of claim 1, wherein the method of the data visualization module comprises the functions of:
displaying data processing results in the form of charts, reports, and dashboards;
providing rich visualization functions and interaction modes;
meeting the requirements of different users.
9. The data management system of claim 1, wherein the method of the data storage and retrieval module comprises the functions of:
providing efficient data storage and retrieval functions;
including data persistence storage, index creation, and query optimization;
tightly integrated with the data processing flow, providing fast and reliable data access capabilities.
CN202310852654.6A 2023-07-12 2023-07-12 Data management system Withdrawn CN116910023A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310852654.6A CN116910023A (en) 2023-07-12 2023-07-12 Data management system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310852654.6A CN116910023A (en) 2023-07-12 2023-07-12 Data management system

Publications (1)

Publication Number Publication Date
CN116910023A true CN116910023A (en) 2023-10-20

Family

ID=88354319

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310852654.6A Withdrawn CN116910023A (en) 2023-07-12 2023-07-12 Data management system

Country Status (1)

Country Link
CN (1) CN116910023A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117131036A (en) * 2023-10-26 2023-11-28 环球数科集团有限公司 Data maintenance system based on big data and artificial intelligence
CN117131036B (en) * 2023-10-26 2023-12-22 环球数科集团有限公司 Data maintenance system based on big data and artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20231020