CN118193502B - Method and computer readable medium for real-time data migration
- Publication number: CN118193502B
- Application number: CN202410586693.0A
- Authority: CN (China)
- Prior art keywords: data, time, real, generating, task
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F16/214: Database migration support
- G06F16/217: Database tuning
- G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F21/602: Providing cryptographic facilities or services
- G06F21/604: Tools and structures for managing or administering access control systems
- G06F21/6227: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
- G06F21/6245: Protecting personal data, e.g. for financial or medical purposes
- G06F9/5033: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering data affinity
- G06F9/5038: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
- H04L47/805: QOS or priority aware
- G06F2209/5021: Priority
- G06F2221/2113: Multi-level security, e.g. mandatory access control
- G06F2221/2141: Access rights, e.g. capability lists, access control lists, access tables, access matrices
Abstract
The invention relates to the technical field of data management, and in particular to a real-time data migration method and a computer readable medium. The method comprises the following steps: based on the migrated data content, sensitivity analysis is performed on the data, sensitive data comprising personal identity information and financial data is identified, security and privacy protection of the data are optimized, and a sensitive data index is generated. By performing sensitivity analysis on the data, the invention improves the level of data security and privacy protection and optimizes data security during transmission between a plurality of systems and platforms. By identifying the urgency and processing priority of each data stream, it optimizes the flexibility and response speed of data stream management and the efficiency of real-time data migration tasks. By allocating computing resources and adjusting network bandwidth in real time, it optimizes the utilization efficiency of network resources, improves data consistency, reduces access delay, and reduces the delay and resource waste of task execution.
Description
Technical Field
The present invention relates to the field of data management technology, and in particular, to a method and a computer readable medium for real-time data migration.
Background
The technical field of data management aims to efficiently store, retrieve, protect and process data, and covers data architecture, database management, data security and data quality control. Data management technology is the basis for maintaining the efficiency of an enterprise information system and supports applications such as data analysis, data mining and data visualization. Effective data management helps improve decision quality, increase operational efficiency and protect data from security threats.
A real-time data migration method aims to synchronize and update data among a plurality of systems in real time and to ensure the timeliness and accuracy of the data. By implementing service continuity, system upgrading, data centralization and data distribution strategies, it supports data consistency across systems, reduces data access delay, and enhances the flexibility of data processing and the response speed of the systems.
Traditional data migration technology lacks flexibility and efficiency in handling real-time data migration and synchronization, and performs poorly with respect to cross-system data consistency and data access delay. It lacks privacy protection measures and has loopholes in data security, so data is easily accessed illegally or leaked during migration. Its resource configuration does not support dynamic adjustment as the data flow changes, which leads to uneven resource allocation and wasted computing resources and lowers overall service execution efficiency. It also cannot allocate bandwidth dynamically, which leads to network congestion and data transmission delay, so data management and data migration are inefficient.
Disclosure of Invention
The object of the present invention is to overcome the drawbacks of the prior art by proposing a method and a computer-readable medium for real-time data migration.
In order to achieve the above object, the present invention adopts the following technical scheme: a real-time data migration method, including the following steps:
S1: based on the content of the migration data, performing sensitivity analysis on the data, identifying sensitive data comprising personal identity information and financial data, optimizing the security and privacy protection of the data, and generating a sensitive data index;
S2: based on the sensitive data index, combining a user role and an access strategy, applying data encryption, optimizing the security and the access efficiency of data in transmission, and generating encrypted mask data;
S3: identifying input and output data requirements of a plurality of real-time data migration tasks by using the encrypted mask data, analyzing the dependency relationship among the plurality of real-time data migration tasks, and generating a task dependency relationship graph;
S4: based on the task dependency graph, the demands of a plurality of data migration tasks on the computing resources are evaluated, and the computing resource allocation is optimized by combining the execution progress of the migration tasks and the resource utilization rate, so that a computing resource allocation table is generated;
S5: based on the computing resource allocation table, identifying the urgency degree and the processing priority of the data stream by analyzing the generation time, the finishing deadline and the resource requirement of the data stream, and generating a data stream priority analysis result;
S6: based on the data flow priority analysis result, the network and bandwidth allocation of a plurality of data migration tasks are adjusted in real time, the utilization efficiency of network resources is optimized, and a real-time network scheduling result is generated.
As a further scheme of the invention, the sensitive data index comprises a personal identity information identification result, a financial data classification result and data sensitive level evaluation information, the encrypted mask data comprises a data access authority table based on role access control, an encrypted data set and data retrieval speed optimization parameters, the task dependency graph comprises a task-to-task data flow graph, a task execution sequence and resource dependency details, the computing resource allocation table comprises an allocated CPU time segment, a memory allocation condition and a resource reallocation plan, the data flow priority analysis result comprises a priority task list, a task emergency processing requirement and a predicted resource adjustment strategy, and the real-time network scheduling result comprises a network bandwidth real-time allocation graph, a task network delay prediction model and a dynamic resource adjustment scheme.
As a further scheme of the invention, based on the content of the migration data, sensitivity analysis is carried out on the data, the sensitive data comprising personal identity information and financial data is identified, the security and privacy protection of the data are optimized, and the step of generating the sensitive data index is specifically as follows:
S101: based on the migration data content, analyzing and identifying personal identity information and financial data in the data set, matching the data sensitivity level, and generating a sensitive information identification result;
S102: based on the sensitive information identification result, performing risk assessment on a plurality of migration data items, wherein the risk assessment comprises influence and leakage probability caused by data leakage, and generating risk data assessment information;
S103: based on the risk data evaluation information, combining data encryption, access control and data desensitization, optimizing security and privacy protection measures of data, and generating a sensitive data index.
As a further scheme of the present invention, based on the sensitive data index, in combination with a user role and an access policy, data encryption is applied, security and access efficiency of data in transmission are optimized, and the step of generating encrypted mask data specifically includes:
S201: based on the sensitive data index, evaluating the access requirements of a plurality of user roles on the sensitive data, matching access policies to the roles, and generating a role access policy table;
S202: configuring encryption levels for a plurality of roles based on the role access policy table, wherein the encryption levels comprise data access permission and data transmission encryption parameters, and generating data encryption configuration;
S203: based on the data encryption configuration, carrying out data encryption on the real-time data migration task, optimizing the data processing speed and the security of the encryption process, and generating the encrypted mask data.
As a further scheme of the invention, the encrypted mask data is utilized to identify the input and output data requirements of a plurality of real-time data migration tasks, the dependency relationship among the plurality of real-time data migration tasks is analyzed, and the step of generating a task dependency relationship graph specifically comprises the following steps:
S301: based on the encrypted mask data, identifying the inputs and outputs of a plurality of data migration tasks and the key data flows and their attributes in the real-time data migration tasks, and generating a data flow attribute table;
S302: based on the data stream attribute table, analyzing interdependencies and data streams among data streams, identifying direct and indirect dependencies among tasks, and generating a dependent data stream analysis result;
S303: based on the analysis result of the dependent data stream, optimizing and adjusting the dependency relationship, including sequencing the data streams and configuring resources, to generate a task dependency relationship graph.
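Steps S301 to S303 can be illustrated with a minimal Python sketch. It assumes each task declares the data streams it reads and writes; the task names, stream names and helper functions are hypothetical and do not come from the patent. Direct dependencies are derived by matching inputs to producers, indirect ones by walking upstream, and one valid execution sequence is obtained with the standard-library `graphlib.TopologicalSorter`.

```python
from graphlib import TopologicalSorter

# Hypothetical data flow attribute table: task -> streams it reads and writes.
tasks = {
    "extract_users": {"inputs": set(), "outputs": {"users_raw"}},
    "mask_users": {"inputs": {"users_raw"}, "outputs": {"users_masked"}},
    "extract_orders": {"inputs": set(), "outputs": {"orders_raw"}},
    "join_orders": {"inputs": {"users_masked", "orders_raw"},
                    "outputs": {"orders_joined"}},
}

def build_dependency_graph(tasks):
    """Map each task to the set of tasks that produce one of its inputs."""
    producer = {}
    for name, io in tasks.items():
        for stream in io["outputs"]:
            producer[stream] = name
    return {
        name: {producer[s] for s in io["inputs"] if s in producer}
        for name, io in tasks.items()
    }

def upstream_dependencies(graph, task):
    """Return every task upstream of `task`, direct and indirect."""
    seen, stack = set(), list(graph[task])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen

graph = build_dependency_graph(tasks)
# static_order() yields predecessors before the tasks that depend on them.
order = list(TopologicalSorter(graph).static_order())
```

Here `graph` plays the role of the dependency analysis result and `order` is one admissible task execution sequence; a cyclic dependency would make `static_order()` raise `graphlib.CycleError`.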
As a further scheme of the invention, based on the task dependency graph, the requirements of a plurality of data migration tasks on the computing resources are evaluated, the computing resource allocation is optimized by combining the execution progress and the resource utilization rate of the migration tasks, and the step of generating a computing resource allocation table specifically comprises the following steps:
S401: based on the task dependency graph, identifying the demands of a plurality of real-time data migration tasks on computing resources, including a CPU, a memory and a storage, and creating a resource demand evaluation table;
S402: based on the resource demand evaluation table, analyzing the execution progress of the real-time data migration task and the utilization rate of the computing resources, and carrying out matching analysis on the demand and supply of the computing resources to generate a resource matching analysis table;
S403: based on the resource matching analysis table, optimizing the resource allocation strategy, adjusting the resource allocation of the data migration tasks to match task requirements and priorities, and generating a computing resource allocation table.
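The matching of resource demand against supply in S401 to S403 can be sketched as follows. The patent does not state an allocation rule, so this example assumes a simple greedy policy: requests are granted in descending priority order and later tasks receive whatever remains. The `Demand` class, field names and units are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Demand:
    task: str
    cpu: int       # requested CPU cores (assumed unit)
    memory: int    # requested memory in GiB (assumed unit)
    priority: int  # larger means more important (assumed convention)

def allocate(demands, cpu_capacity, memory_capacity):
    """Grant requests in priority order; later tasks get what remains."""
    table = {}
    cpu_left, mem_left = cpu_capacity, memory_capacity
    for d in sorted(demands, key=lambda d: d.priority, reverse=True):
        cpu = min(d.cpu, cpu_left)
        mem = min(d.memory, mem_left)
        table[d.task] = {
            "cpu": cpu,
            "memory": mem,
            # False marks a task that needs reallocation later.
            "satisfied": cpu == d.cpu and mem == d.memory,
        }
        cpu_left -= cpu
        mem_left -= mem
    return table

demands = [Demand("a", 4, 8, 1), Demand("b", 6, 16, 3), Demand("c", 2, 4, 2)]
table = allocate(demands, cpu_capacity=10, memory_capacity=24)
```

With 10 cores and 24 GiB available, tasks `b` and `c` are fully served and the lowest-priority task `a` is only partly served, which is the kind of shortfall a resource reallocation plan would then address.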
As a further aspect of the present invention, based on the computing resource allocation table, by analyzing the generation time, the completion deadline and the resource requirement of the data stream, the urgency degree and the processing priority of the data stream are identified, and the step of generating the data stream priority analysis result specifically includes:
S501: analyzing the generation time and the expected completion period of the data stream in a plurality of real-time data migration tasks based on the computing resource allocation table, and generating a data stream time attribute table;
S502: analyzing the urgency of the data stream based on the data stream time attribute table, and generating a data stream urgency rating result by combining business influence and time sensitivity;
S503: based on the data stream urgency rating result, combined with the original resource requirement, adopting a FIFO algorithm to evaluate and set the processing priorities of a plurality of data streams, and generating a data stream priority rating result.
The FIFO algorithm follows a formula that calculates a processing priority score for each data stream. The formula and its symbols are not preserved in this text; only the definitions of its terms survive:
- the data stream for which the processing priority is calculated;
- the priority score of that data stream;
- a scaling factor for the time factor;
- the length of time since the data stream was generated;
- a time smoothing constant;
- the weight coefficient of the business impact;
- the business impact score of the data stream;
- the weight coefficient of the time sensitivity;
- the time sensitivity score of the data stream;
- the weight coefficient of the urgency rating;
- the urgency rating of the data stream;
- the weight coefficient of the expected completion deadline;
- the difference between the expected completion deadline of the data stream and the current time.
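Because the formula itself is missing from this text, the following Python sketch shows only one plausible way to combine the listed terms. The functional form is an assumption and should not be read as the patented formula: a saturating age term `alpha * age / (age + tau)`, a weighted sum of the three scores, and a deadline term that grows as the deadline approaches. All parameter names and default weights are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Weights:
    alpha: float = 1.0    # scaling factor for the time factor
    tau: float = 60.0     # time smoothing constant, in seconds (assumed unit)
    beta: float = 1.0     # weight of the business impact score
    gamma: float = 1.0    # weight of the time sensitivity score
    delta: float = 1.0    # weight of the urgency rating
    epsilon: float = 1.0  # weight of the completion-deadline term

def priority_score(age, impact, sensitivity, urgency, time_to_deadline,
                   w=Weights()):
    """ASSUMED form; the source text does not preserve the real formula."""
    # Older streams score higher, but the term saturates at alpha.
    age_term = w.alpha * age / (age + w.tau)
    # Closer deadlines score higher; clamp to avoid division by zero.
    deadline_term = w.epsilon / max(time_to_deadline, 1.0)
    return (age_term + w.beta * impact + w.gamma * sensitivity
            + w.delta * urgency + deadline_term)
```

Under this assumed form, two streams with identical scores are ranked by age and by remaining time, so a stream that has waited longer or is nearer its deadline is processed first.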
As a further scheme of the present invention, based on the data flow priority analysis result, the network and bandwidth allocation of a plurality of data migration tasks are adjusted in real time, the utilization efficiency of network resources is optimized, and the step of generating a real-time network scheduling result specifically includes:
S601: based on the data flow priority analysis result, evaluating the consistency of real-time network resource allocation and priority demands, identifying resource reconfiguration demands and generating a network resource adjustment scheme;
S602: according to the network resource adjustment scheme, bandwidth and route allocation schemes are adjusted, resource allocation of a plurality of data migration tasks is optimized, and a network bandwidth configuration result is generated;
S603: based on the network bandwidth configuration result, the influence of the adjusted network configuration on the data migration efficiency is monitored and analyzed in real time, and the computing resource allocation strategy is optimized and adjusted to generate a real-time network scheduling result.
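The bandwidth adjustment in S601 and S602 can be sketched as a proportional split. This is an illustrative assumption rather than the scheme claimed by the patent: every task receives a guaranteed floor, and the remaining bandwidth is shared in proportion to each task's priority score. The function and parameter names are hypothetical.

```python
def allocate_bandwidth(priorities, total_mbps, floor_mbps=0.0):
    """Split total_mbps across tasks in proportion to priority, above a floor."""
    if not priorities:
        return {}
    reserved = floor_mbps * len(priorities)
    if reserved > total_mbps:
        raise ValueError("guaranteed floor exceeds total bandwidth")
    spare = total_mbps - reserved
    weight_sum = sum(priorities.values())
    shares = {}
    for task, priority in priorities.items():
        if weight_sum:
            extra = spare * priority / weight_sum
        else:
            # All priorities are zero: share the spare bandwidth equally.
            extra = spare / len(priorities)
        shares[task] = floor_mbps + extra
    return shares

shares = allocate_bandwidth({"a": 3, "b": 1}, total_mbps=100, floor_mbps=10)
```

Re-running the function whenever the priority analysis changes gives a simple form of real-time adjustment; the floor prevents low-priority migrations from being starved entirely.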
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the real-time data migration method described above.
Compared with the prior art, the invention has the advantages and positive effects that:
By performing sensitivity analysis on the data, the invention improves the level of data security and privacy protection and ensures data security during transmission among a plurality of systems and platforms. By analyzing the generation time, completion deadline and resource requirement of each data stream, it identifies the urgency and processing priority of the data stream, optimizes the flexibility and response speed of data stream management, and improves the efficiency of real-time data migration tasks. By allocating computing resources and adjusting network bandwidth in real time, it optimizes the utilization efficiency of network resources, improves data consistency, reduces access delay, provides necessary support for data-intensive operations, and reduces the delay and resource waste of task execution.
Drawings
FIG. 1 is a schematic workflow diagram of the present invention;
FIG. 2 is a detailed flowchart of step S1 of the present invention;
FIG. 3 is a detailed flowchart of step S2 of the present invention;
FIG. 4 is a detailed flowchart of step S3 of the present invention;
FIG. 5 is a detailed flowchart of step S4 of the present invention;
FIG. 6 is a detailed flowchart of step S5 of the present invention;
FIG. 7 is a detailed flowchart of step S6 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In the description of the present invention, it should be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on the orientation or positional relationships shown in the drawings, merely to facilitate describing the present invention and simplify the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and therefore should not be construed as limiting the present invention. Furthermore, in the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
Example 1
Referring to FIG. 1, the present invention provides a technical solution: a real-time data migration method including the following steps:
S1: based on the content of the migration data, performing sensitivity analysis on the data, identifying sensitive data comprising personal identity information and financial data, optimizing the security and privacy protection of the data, and generating a sensitive data index;
S2: based on the sensitive data index, combining the user role and the access strategy, applying data encryption, optimizing the security and the access efficiency of the data in transmission, and generating encrypted mask data;
S3: identifying input and output data requirements of a plurality of real-time data migration tasks by utilizing the encrypted mask data, analyzing the dependency relationship among the plurality of real-time data migration tasks, and generating a task dependency relationship graph;
S4: based on the task dependency graph, the demands of a plurality of data migration tasks on the computing resources are evaluated, and the computing resource allocation is optimized by combining the execution progress of the migration tasks and the resource utilization rate, so that a computing resource allocation table is generated;
S5: based on the computing resource allocation table, identifying the urgency degree and the processing priority of the data stream by analyzing the generation time, the finishing deadline and the resource requirement of the data stream, and generating a data stream priority analysis result;
S6: based on the data flow priority analysis result, the network and bandwidth allocation of a plurality of data migration tasks are adjusted in real time, the utilization efficiency of network resources is optimized, and a real-time network scheduling result is generated.
The sensitive data index comprises a personal identity information identification result, a financial data classification result and data sensitivity level evaluation information, the encrypted mask data comprises a data access authority table based on role access control, an encrypted data set and data retrieval speed optimization parameters, the task dependency graph comprises a task-to-task data flow graph, a task execution sequence and resource dependency detail, the computing resource allocation table comprises an allocated CPU time segment, a memory allocation condition and a resource reallocation plan, the data flow priority analysis result comprises a priority task list, a task emergency processing requirement and a predicted resource adjustment strategy, and the real-time network scheduling result comprises a network bandwidth real-time allocation graph, a task network delay prediction model and a dynamic resource adjustment scheme.
Referring to fig. 2, based on the migration data content, the data is subjected to sensitivity analysis, sensitive data including personal identity information and financial data is identified, security and privacy protection of the data are optimized, and the step of generating a sensitive data index is specifically as follows:
S101: based on the migration data content, analyzing and identifying personal identity information and financial data in the data set, matching the data sensitivity level, and generating a sensitive information identification result. The process is specifically as follows:
In substep S101, based on the migration data content, natural language processing is applied: the NLTK library for the Python programming language is used for part-of-speech tagging and entity recognition, personal identity information and financial data are recognized through a custom sensitive-word dictionary, the data items are classified, and the recognition results are mapped to data sensitivity levels to generate the sensitive information identification result.
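The dictionary-based part of substep S101 can be illustrated with a short Python sketch. To stay self-contained it swaps NLTK tagging for plain regular expressions and substring matching from the standard library, so it is a simplification rather than the pipeline described above. The patterns, keywords and sensitivity levels are hypothetical examples.

```python
import re

# Hypothetical sensitive dictionary: category -> (sensitivity level, pattern).
SENSITIVE_PATTERNS = {
    "email": ("medium", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
    "card_number": ("high", re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")),
}
# Hypothetical sensitive keywords matched in field names and values.
SENSITIVE_WORDS = {"salary": "medium", "account balance": "high"}

def identify_sensitive(record):
    """Return (field, category, level) for every sensitive hit in a record."""
    findings = []
    for field, value in record.items():
        text = str(value)
        for category, (level, pattern) in SENSITIVE_PATTERNS.items():
            if pattern.search(text):
                findings.append((field, category, level))
        field_words = field.lower().replace("_", " ")
        for word, level in SENSITIVE_WORDS.items():
            if word in field_words or word in text.lower():
                findings.append((field, "keyword:" + word, level))
    return findings

record = {
    "contact": "alice@example.com",
    "card": "4111 1111 1111 1111",
    "salary": 5000,
    "note": "hello",
}
findings = identify_sensitive(record)
```

Each finding pairs a field with a category and a sensitivity level, which corresponds to mapping recognition results to data sensitivity levels; a production system would need far more robust patterns and validation than this sketch.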
S102: Based on the sensitive information identification result, risk assessment is performed on the plurality of migration data items, covering the impact of a data leak and the probability of leakage, and risk data evaluation information is generated. The process is specifically as follows:
In substep S102, based on the sensitive information identification result, statistical analysis is applied: the SciPy library for Python is used for data analysis, the leakage impact score and leakage probability of each group of data are calculated, a risk rating is assigned to the data by setting thresholds, high-risk data are flagged, and the risk data evaluation information is generated.
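The threshold-based rating in substep S102 might look like the following sketch. The patent does not state how impact and probability are combined; taking their product as the risk score, and the threshold values 0.6 and 0.3, are assumptions made here for illustration, as is the function name `rate_risk`.

```python
def rate_risk(items, high=0.6, medium=0.3):
    """Rate each data item from its leakage impact and leakage probability.

    items: iterable of (name, impact, probability), both values in [0, 1].
    The score is impact * probability (an assumed combination rule);
    items at or above `high` are flagged as high risk.
    """
    rated = []
    for name, impact, probability in items:
        score = impact * probability
        if score >= high:
            rating = "high"
        elif score >= medium:
            rating = "medium"
        else:
            rating = "low"
        rated.append({
            "item": name,
            "score": round(score, 3),
            "rating": rating,
            "flagged": rating == "high",
        })
    return rated
```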
S103: Based on the risk data evaluation information, data encryption, access control and data desensitization are combined to optimize the security and privacy protection measures for the data, and the sensitive data index is generated. The process is specifically as follows:
In substep S103, based on the risk data evaluation information, data encryption and access control techniques are combined: the OpenSSL library is used to perform AES encryption, access rights are set for the data, and a data desensitization method is applied to the data, so that data security is optimized through the dual strategy of encryption and desensitization and the sensitive data index is generated.
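Of the two strategies in substep S103, only desensitization is sketched here, because AES encryption would require OpenSSL bindings outside the standard library. The masking rule (keep a few leading and trailing characters, mask the rest) is one common approach and an assumption; the patent does not specify the desensitization method, and `mask_value` is a hypothetical name.

```python
def mask_value(value, keep_head=3, keep_tail=4, mask_char="*"):
    """Mask the middle of a sensitive string, keeping its head and tail.

    Values too short to keep both ends are masked completely so that
    no part of a short identifier leaks.
    """
    if len(value) <= keep_head + keep_tail:
        return mask_char * len(value)
    masked_len = len(value) - keep_head - keep_tail
    tail = value[len(value) - keep_tail:]  # safe even when keep_tail == 0
    return value[:keep_head] + mask_char * masked_len + tail
```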
Referring to fig. 3, the step of applying data encryption based on the sensitive data index in combination with user roles and access policies, optimizing the security and access efficiency of data in transmission, and generating encrypted mask data is specifically as follows:
S201: Based on the sensitive data index, the access requirements of a plurality of user roles for the sensitive data are evaluated, access policies are matched to the roles, and a role access policy table is generated. The process is specifically as follows:
In substep S201, based on the sensitive data index, the access requirements of the different user roles for the sensitive data are evaluated, differentiated data access policies are set with SQL commands according to each role's business function and data access frequency, and access rights are assigned to the user roles to generate the role access policy table.
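A role access policy table of the kind described in substep S201 could be set up with SQL roughly as follows. The sketch uses an in-memory SQLite database; the table name `role_access_policy`, its columns and the rule that a role may read data up to its `max_level` are hypothetical choices, not details from the patent.

```python
import sqlite3


def build_policy_table(roles):
    """Create an in-memory role access policy table.

    roles: iterable of (role, max_level, can_export) rows, where
    max_level is the highest sensitivity level the role may read.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE role_access_policy ("
        "role TEXT PRIMARY KEY, "
        "max_level INTEGER NOT NULL, "
        "can_export INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO role_access_policy VALUES (?, ?, ?)", roles)
    conn.commit()
    return conn


def can_access(conn, role, data_level):
    """Return True if the role exists and may read data at data_level."""
    row = conn.execute(
        "SELECT max_level FROM role_access_policy WHERE role = ?", (role,)
    ).fetchone()
    return row is not None and data_level <= row[0]
```

Unknown roles are denied by default, which is the conservative choice for sensitive data.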
S202: Based on the role access policy table, encryption levels are configured for the plurality of roles, including setting data access rights and data transmission encryption parameters, and a data encryption configuration is generated. The process is specifically as follows:
In substep S202, based on the role access policy table, different encryption levels are configured for the roles: the OpenSSL library is used to adjust the data encryption parameters, the encryption algorithm and key length used in data transmission are set, and data encryption is configured through commands to ensure security during data transmission, generating the data encryption configuration.
S203: Based on the data encryption configuration, data encryption is applied to the real-time data migration tasks, the data processing speed and security of the encryption process are optimized, and the encrypted mask data is generated. The process is specifically as follows:
In substep S203, based on the data encryption configuration, data encryption is applied to the real-time data migration tasks, and the operating parameters of the encryption algorithm are adjusted to optimize the processing speed and security of the encryption process so that encryption does not degrade data transmission efficiency, generating the encrypted mask data.
Referring to fig. 4, the step of identifying the input and output data requirements of a plurality of real-time data migration tasks using the encrypted mask data, analyzing the dependency relationships among the plurality of real-time data migration tasks, and generating a task dependency graph is specifically as follows:
S301: Based on the encrypted mask data, the inputs and outputs of the plurality of data migration tasks are identified, the key data streams and attributes in the real-time data migration tasks are identified, and a data stream attribute table is generated. The process is specifically as follows:
In substep S301, based on the encrypted mask data, data stream analysis is applied: the Java programming language is used together with the Apache Kafka stream processing framework to identify the inputs and outputs of the data migration tasks, key data streams and data attributes are identified by parsing packet header information, the data streams are tagged with attributes, data stream dynamics are tracked in real time, and the data stream attribute table is generated.
S302: Based on the data stream attribute table, the interdependencies and data flow among the data streams are analyzed, the direct and indirect dependencies among tasks are identified, and a dependent data stream analysis result is generated. The process is specifically as follows:
In substep S302, based on the data stream attribute table, graph-theoretic analysis is applied: the NetworkX library for Python is used to construct a data stream dependency graph, the connections between data stream nodes and edges are analyzed, direct and indirect dependencies among the data streams are identified through matrix operations, the data flow between tasks is analyzed, and the dependent data stream analysis result is generated.
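The distinction between direct and indirect dependencies in substep S302 can be sketched without NetworkX. The version below uses a breadth-first traversal over an adjacency mapping rather than the matrix operations mentioned in the text; both compute reachability, but the traversal is a substitution made here for brevity, and the function name `dependencies` is hypothetical.

```python
from collections import deque


def dependencies(graph):
    """Split each node's upstream nodes into direct and indirect sets.

    graph: dict mapping a data stream (or task) to the set of streams
    it depends on directly. Indirect dependencies are everything
    reachable upstream that is not already a direct dependency.
    """
    result = {}
    for node in graph:
        direct = set(graph.get(node, ()))
        seen = set()
        queue = deque(direct)
        while queue:
            current = queue.popleft()
            if current in seen:
                continue  # already expanded; also guards against cycles
            seen.add(current)
            queue.extend(graph.get(current, ()))
        result[node] = {"direct": direct, "indirect": seen - direct}
    return result
```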
S303: Based on the dependent data stream analysis result, the dependency relationships are optimized and adjusted, including ordering the data streams and configuring resources, and the task dependency graph is generated. The process is specifically as follows:
In substep S303, based on the dependent data stream analysis result, a priority scheduling algorithm is applied: the C++ programming language is used to mark the priority of each data stream, the ordering of the data streams and the resource configuration are adjusted, and the execution order of the data processing tasks is reconfigured by an optimization algorithm that calculates resource utilization efficiency, generating the task dependency graph.
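One way to order tasks so that dependencies are respected and higher-priority tasks run first among those that are ready is a topological sort driven by a heap. The sketch below is written in Python rather than the C++ named in the text, and it does not model the resource-utilization optimization; `schedule` and its arguments are hypothetical.

```python
import heapq


def schedule(dependencies, priority):
    """Return an execution order that honors dependencies and priority.

    dependencies: dict mapping each task to the set of tasks it needs
    (every referenced task must also be a key).
    priority: dict mapping each task to a number; larger runs earlier
    among the tasks whose dependencies are already satisfied.
    """
    indegree = {task: len(deps) for task, deps in dependencies.items()}
    dependents = {task: [] for task in dependencies}
    for task, deps in dependencies.items():
        for dep in deps:
            dependents[dep].append(task)

    # heapq is a min-heap, so priorities are negated.
    ready = [(-priority[t], t) for t, n in indegree.items() if n == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, task = heapq.heappop(ready)
        order.append(task)
        for nxt in dependents[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, (-priority[nxt], nxt))
    if len(order) != len(dependencies):
        raise ValueError("dependency cycle detected")
    return order
```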
Referring to fig. 5, the step of evaluating the demands of the plurality of data migration tasks on computing resources based on the task dependency graph, optimizing computing resource allocation in combination with the execution progress and resource utilization of the migration tasks, and generating a computing resource allocation table is specifically as follows:
S401: Based on the task dependency graph, the demands of the plurality of real-time data migration tasks on computing resources, including CPU, memory and storage, are identified, and a resource demand evaluation table is created. The process is specifically as follows:
In substep S401, based on the task dependency graph, resource demand prediction is applied: the R language and a linear regression model are used to analyze the demands of the data migration tasks on computing resources, including CPU utilization prediction, memory consumption pattern analysis and storage demand estimation, the resource demands of the tasks are determined through statistical analysis, and the resource demand evaluation table is generated.
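The linear regression in substep S401 can be illustrated with a single-predictor least-squares fit. The sketch is in Python rather than R, and the choice of predictor (for example, data volume) and response (for example, CPU utilization) is an assumption; the patent does not say which variables the model uses.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Predict the resource demand for a new predictor value."""
    slope, intercept = model
    return slope * x + intercept
```

For instance, a model fitted on past (data volume, CPU utilization) pairs could be used to fill the CPU column of the resource demand evaluation table for upcoming tasks.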
S402: Based on the resource demand evaluation table, the execution progress of the real-time data migration tasks and the utilization of computing resources are analyzed, the demand for and supply of computing resources are matched, and a resource matching analysis table is generated. The process is specifically as follows:
In substep S402, based on the resource demand evaluation table, a resource matching algorithm is applied: the Python programming language and the SciPy optimization library are used to analyze the execution progress of the real-time data migration tasks and the utilization of computing resources, resource supply and demand are matched through a dynamic adjustment algorithm, the peaks and troughs of resource usage are analyzed, resource allocation efficiency is optimized, and the resource matching analysis table is generated.
S403: Based on the resource matching analysis table, the resource allocation strategy is optimized, the resource allocation of the data migration tasks is adjusted to match task requirements and priorities, and the computing resource allocation table is generated. The process is specifically as follows:
In substep S403, based on the resource matching analysis table, a dynamic resource allocation strategy is applied: shell scripts and Linux system commands are used to adjust the resource allocation of the data migration tasks, computing resources, including CPU allocation, memory management and storage space allocation, are configured according to task requirements and priorities, and the computing resource allocation table is generated.
Referring to fig. 6, the step of identifying the urgency and processing priority of the data streams based on the computing resource allocation table by analyzing the generation time, completion deadline and resource requirements of the data streams, and generating a data stream priority analysis result is specifically as follows:
S501: Based on the computing resource allocation table, the generation time and expected completion deadline of the data streams in the plurality of real-time data migration tasks are analyzed, and a data stream time attribute table is generated. The process is specifically as follows:
In substep S501, based on the computing resource allocation table, time series analysis is applied: the Pandas library for Python is used to process the time data, the data stream generation times and expected completion deadlines of the data migration tasks are analyzed, a timeline is arranged through timestamp comparison and a sorting algorithm, and the data stream time attribute table is generated.
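A data stream time attribute table like the one in substep S501 can be sketched with the standard `datetime` module in place of Pandas. The column names `age_s` and `remaining_s` and the sort order (nearest deadline first, older streams first on ties) are assumptions chosen for illustration.

```python
from datetime import datetime


def time_attribute_table(streams, now):
    """Build a time attribute table sorted by remaining time.

    streams: iterable of (name, created, deadline) with datetime values.
    now: the reference time used to compute age and remaining time.
    """
    rows = []
    for name, created, deadline in streams:
        rows.append({
            "stream": name,
            "age_s": (now - created).total_seconds(),
            "remaining_s": (deadline - now).total_seconds(),
        })
    # Nearest deadline first; on a tie the older stream comes first.
    return sorted(rows, key=lambda r: (r["remaining_s"], -r["age_s"]))
```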
S502: Based on the data stream time attribute table, the urgency of the data streams is analyzed in combination with business impact and time sensitivity, and a data stream urgency rating result is generated. The process is specifically as follows:
In substep S502, based on the data stream time attribute table, an urgency rating algorithm is applied: the Java programming language is used to sort and classify the data streams by combining business impact and time sensitivity factors, an urgency index is calculated, the urgency of each data stream is evaluated, and the data stream urgency rating result is generated.
S503: Based on the data stream urgency rating result and in combination with the original resource requirements, a FIFO algorithm is used to evaluate and set the processing priorities of the plurality of data streams, and the data stream priority analysis result is generated. The process is specifically as follows:
In substep S503, based on the data stream urgency rating result, a priority queue management strategy is applied: it is implemented with the .NET Framework in the C# programming language, the data stream processing priorities are adjusted in combination with the original resource demand evaluation, the processing priorities of the plurality of data streams are evaluated and set using the FIFO algorithm, and the data stream priority analysis result is generated.
The FIFO algorithm calculates the processing priority score of a data stream according to a formula (the formula and the coefficient symbols are not reproduced in this text), wherein: i is the data stream for which the processing priority is calculated; P_i is the priority score of data stream i, used to determine the order in which the plurality of data streams are processed; the scaling factor of the time factor adjusts how strongly the time T_i affects the priority score; T_i is the length of time since data stream i was generated, a direct measure of the age of the data stream; the time smoothing constant is added to T_i to avoid division by zero and to give a smooth time effect for newly generated data streams; the business impact weight coefficient determines the contribution of B_i to the total score; B_i is the business impact score of data stream i, which evaluates the extent to which the data stream affects business operation; the time sensitivity weight coefficient determines the contribution of S_i to the total score; S_i is the time sensitivity score of data stream i, which indicates how time-sensitive the processing of the data stream is; the urgency rating weight coefficient determines the contribution of U_i to the total score; U_i is the urgency rating of data stream i, obtained directly from the business system and rating mechanism and reflecting the urgency of processing the data stream; the expected completion deadline weight coefficient determines the contribution of E_i to the total score; and E_i is the difference between the expected completion deadline of data stream i and the current time, used to evaluate how urgently the data stream must be completed.
The improved formula is implemented as follows:
Using historical completion times, business impact records and urgency rating data, the four weight coefficients, the scaling factor and the smoothing constant are determined by multiple linear regression analysis, finding the parameter values that minimize the prediction error. Calculating P_i provides a comprehensive score for each data stream, and the data streams are ordered for processing according to this score, so that the most critical and urgent tasks are processed first, resource allocation is optimized and task execution efficiency is improved.
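Because the formula itself is not reproduced in this text, the following is only a sketch of one form that is consistent with the symbol descriptions: a time term in which the smoothing constant is added to T_i in a denominator, plus four weighted additive terms. That form, the parameter names `alpha`, `beta`, `w_b`, `w_s`, `w_u`, `w_e`, and the `Stream` class are all assumptions and should not be read as the patent's actual formula.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    name: str
    age: float          # T_i: time since the stream was generated
    business: float     # B_i: business impact score
    sensitivity: float  # S_i: time sensitivity score
    urgency: float      # U_i: urgency rating
    time_left: float    # E_i: expected deadline minus current time


def priority_score(s, *, alpha, beta, w_b, w_s, w_u, w_e):
    """Assumed form: alpha / (T + beta) + w_b*B + w_s*S + w_u*U + w_e*E."""
    if s.age + beta <= 0:
        raise ValueError("age + beta must be positive")
    return (
        alpha / (s.age + beta)
        + w_b * s.business
        + w_s * s.sensitivity
        + w_u * s.urgency
        + w_e * s.time_left
    )


def rank(streams, **params):
    """Order streams from highest to lowest assumed priority score."""
    return sorted(streams, key=lambda s: priority_score(s, **params), reverse=True)
```

The parameters are passed explicitly because, according to the text, they are obtained by regression on historical data rather than fixed in advance.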
Referring to fig. 7, the step of adjusting the network and bandwidth allocation of the plurality of data migration tasks in real time based on the data flow priority analysis result, optimizing the utilization efficiency of network resources, and generating a real-time network scheduling result is specifically as follows:
S601: Based on the data flow priority analysis result, the consistency between the real-time network resource allocation and the priority requirements is evaluated, resource reconfiguration requirements are identified, and a network resource adjustment scheme is generated. The process is specifically as follows:
In substep S601, based on the data flow priority analysis result, network resource management is applied: the Python programming language is used together with the NumPy library for data processing, the consistency between the real-time network resource allocation and the priority requirements is evaluated, resource reconfiguration requirements are identified through network traffic analysis and resource utilization calculation, and the network resource adjustment scheme is generated.
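The consistency check in substep S601 could be approximated by comparing each task's current bandwidth with a share proportional to its priority. Proportional sharing, the 5% tolerance and the function name `bandwidth_adjustments` are assumptions made here; the patent does not define how consistency is measured, and the sketch uses plain Python instead of NumPy.

```python
def bandwidth_adjustments(allocated, priority, tolerance=0.05):
    """List the bandwidth changes needed to match priority shares.

    allocated: dict task -> currently allocated bandwidth.
    priority: dict task -> positive priority score.
    Returns task -> delta (positive means more bandwidth is needed);
    deltas within tolerance * total bandwidth are ignored.
    """
    total_bw = sum(allocated.values())
    total_priority = sum(priority[t] for t in allocated)
    if total_bw <= 0 or total_priority <= 0:
        raise ValueError("totals must be positive")
    plan = {}
    for task, current in allocated.items():
        target = total_bw * priority[task] / total_priority
        delta = target - current
        if abs(delta) > tolerance * total_bw:
            plan[task] = round(delta, 3)
    return plan
```

An empty result means the current allocation is already consistent with the priorities within the chosen tolerance.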
S602: According to the network resource adjustment scheme, the bandwidth and route allocation scheme is adjusted, the resource allocation of the plurality of data migration tasks is optimized, and a network bandwidth configuration result is generated. The process is specifically as follows:
In substep S602, based on the network resource adjustment scheme, network optimization is applied to adjust bandwidth and route allocation: network device configuration files are set through a command line interface, the resource allocation of the plurality of data migration tasks is optimized, network bandwidth and routing parameters are adjusted, and the network bandwidth configuration result is generated.
S603: Based on the network bandwidth configuration result, the effect of the adjusted network configuration on data migration efficiency is monitored and analyzed in real time, the computing resource allocation strategy is optimized and adjusted, and the real-time network scheduling result is generated. The process is specifically as follows:
In substep S603, based on the network bandwidth configuration result, real-time monitoring is applied: the JavaScript programming language and WebSocket technology are used to update and analyze the network state in real time, the effect of the adjusted network configuration on data migration efficiency is analyzed through packet capture and transmission efficiency calculation, the computing resource allocation strategy is optimized and adjusted accordingly, and the real-time network scheduling result is generated.
The present invention is not limited to the above embodiments. A person skilled in the art may use the technical content disclosed above to change or modify the embodiments into equivalent embodiments and apply them in other fields; however, any simple modification, equivalent change or adaptation made to the above embodiments according to the technical substance of the present invention still falls within the scope of the technical solution disclosed herein.
Claims (6)
1. A method for real-time data migration, comprising the steps of:
Based on the content of the migration data, performing sensitivity analysis on the data, identifying sensitive data comprising personal identity information and financial data, optimizing the security and privacy protection of the data, and generating a sensitive data index;
based on the sensitive data index, combining a user role and an access strategy, applying data encryption, optimizing the security and the access efficiency of data in transmission, and generating encrypted mask data;
Identifying input and output data requirements of a plurality of real-time data migration tasks by using the encrypted mask data, analyzing the dependency relationship among the plurality of real-time data migration tasks, and generating a task dependency relationship graph;
Based on the task dependency graph, the demands of a plurality of data migration tasks on the computing resources are evaluated, and the computing resource allocation is optimized by combining the execution progress of the migration tasks and the resource utilization rate, so that a computing resource allocation table is generated;
Based on the computing resource allocation table, identifying the urgency degree and the processing priority of the data stream by analyzing the generation time, the finishing deadline and the resource requirement of the data stream, and generating a data stream priority analysis result;
based on the data flow priority analysis result, the network and bandwidth allocation of a plurality of data migration tasks are adjusted in real time, the utilization efficiency of network resources is optimized, and a real-time network scheduling result is generated;
The sensitive data index comprises a personal identity information identification result, a financial data classification result and data sensitivity level evaluation information, the encrypted mask data comprises a data access authority table based on role access control, an encrypted data set and data retrieval speed optimization parameters, the task dependency graph comprises a task-to-task data flow graph, a task execution sequence and resource dependency detail, the computing resource allocation table comprises an allocated CPU time segment, a memory allocation condition and a resource reallocation plan, the data flow priority analysis result comprises a priority task list, a task emergency processing requirement and a predicted resource adjustment strategy, and the real-time network scheduling result comprises a network bandwidth real-time allocation graph, a task network delay prediction model and a dynamic resource adjustment scheme;
identifying input and output data requirements of a plurality of real-time data migration tasks by using the encrypted mask data, analyzing the dependency relationship among the plurality of real-time data migration tasks, and generating a task dependency relationship graph specifically comprises the following steps:
Based on the encrypted mask data, identifying the inputs and outputs of the plurality of data migration tasks, identifying the key data streams and attributes in the real-time data migration tasks, and generating a data stream attribute table;
based on the data stream attribute table, analyzing interdependencies and data streams among data streams, identifying direct and indirect dependencies among tasks, and generating a dependent data stream analysis result;
optimizing and adjusting the dependency relationship based on the analysis result of the dependent data stream, including sequencing the data stream and configuring the resource to generate a task dependency relationship graph;
Based on the computing resource allocation table, the emergency degree and the processing priority of the data stream are identified by analyzing the generation time, the finishing deadline and the resource requirement of the data stream, and the step of generating the data stream priority analysis result specifically comprises the following steps:
Analyzing the generation time and the expected completion period of the data stream in a plurality of real-time data migration tasks based on the computing resource allocation table, and generating a data stream time attribute table;
analyzing the urgency of the data stream based on the data stream time attribute table, and generating a data stream urgency rating result by combining business influence and time sensitivity;
Based on the data stream emergency rating result, combining with the original resource requirement, adopting a FIFO algorithm to evaluate and set the processing priority of a plurality of data streams, and generating a data stream priority analysis result;
the FIFO algorithm calculates the processing priority score of the data stream according to a formula (the formula and the coefficient symbols are not reproduced in this text), wherein i is the data stream for which the processing priority is calculated, P_i is the priority score of data stream i, T_i is the length of time since data stream i was generated, B_i is the business impact score of data stream i, S_i is the time sensitivity score of data stream i, U_i is the urgency rating of data stream i, and E_i is the difference between the expected completion deadline of data stream i and the current time; the coefficients are a scaling factor for the time factor, a time smoothing constant, a business impact weight coefficient, a time sensitivity weight coefficient, an urgency rating weight coefficient and an expected completion deadline weight coefficient.
2. The method for real-time data migration according to claim 1, wherein the step of performing sensitivity analysis on the data based on the migration data content, identifying sensitive data including personal identity information and financial data, optimizing security and privacy protection of the data, and generating a sensitive data index is specifically:
based on the migration data content, analyzing and identifying personal identity information and financial data in the data set, matching the data sensitivity level, and generating a sensitive information identification result;
based on the sensitive information identification result, performing risk assessment on a plurality of migration data items, wherein the risk assessment comprises influence and leakage probability caused by data leakage, and generating risk data assessment information;
Based on the risk data evaluation information, combining data encryption, access control and data desensitization, optimizing security and privacy protection measures of data, and generating a sensitive data index.
3. The method for real-time data migration according to claim 1, wherein the step of applying data encryption based on the sensitive data index in combination with user roles and access policies, optimizing the security and access efficiency of data in transmission, and generating the encrypted mask data is specifically:
based on the sensitive data index, evaluating the access requirements of a plurality of user roles on the sensitive data, and generating a role access policy table for matching access policies of various roles;
configuring encryption levels for a plurality of roles based on the role access policy table, wherein the encryption levels comprise data access permission and data transmission encryption parameters, and generating data encryption configuration;
And based on the data encryption configuration, carrying out data encryption on the real-time data migration task, optimizing the data processing speed and the safety in the encryption process, and generating the encrypted mask data.
4. The method for real-time data migration according to claim 1, wherein the step of evaluating the demands of the plurality of data migration tasks on the computing resources based on the task dependency graph, optimizing the computing resource allocation in combination with the execution progress and the resource utilization of the migration tasks, and generating the computing resource allocation table is specifically as follows:
Based on the task dependency graph, identifying the demands of a plurality of real-time data migration tasks on computing resources, including a CPU, a memory and a storage, and creating a resource demand evaluation table;
based on the resource demand evaluation table, analyzing the execution progress of the real-time data migration task and the utilization rate of the computing resources, and carrying out matching analysis on the demand and supply of the computing resources to generate a resource matching analysis table;
and optimizing a resource allocation strategy based on the resource matching analysis table, adjusting the resource allocation of the data migration task, matching the task requirement and the priority, and generating a computing resource allocation table.
5. The method for real-time data migration according to claim 1, wherein the step of adjusting network and bandwidth allocation of a plurality of data migration tasks in real time based on the data flow priority analysis result, optimizing the utilization efficiency of network resources, and generating a real-time network scheduling result is specifically:
based on the data flow priority analysis result, evaluating the consistency of real-time network resource allocation and priority demands, identifying resource reconfiguration demands and generating a network resource adjustment scheme;
according to the network resource adjustment scheme, bandwidth and route allocation schemes are adjusted, resource allocation of a plurality of data migration tasks is optimized, and a network bandwidth configuration result is generated;
Based on the network bandwidth configuration result, the influence of the adjusted network configuration on the data migration efficiency is monitored and analyzed in real time, and the computing resource allocation strategy is optimized and adjusted to generate a real-time network scheduling result.
6. A computer readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method for real-time data migration according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410586693.0A CN118193502B (en) | 2024-05-13 | 2024-05-13 | Method and computer readable medium for real-time data migration |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118193502A (en) | 2024-06-14 |
CN118193502B (en) | 2024-07-09 |
Family
ID=91393076
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410586693.0A Active CN118193502B (en) | 2024-05-13 | 2024-05-13 | Method and computer readable medium for real-time data migration |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118193502B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118400724B (en) * | 2024-06-25 | 2024-08-23 | 国家海洋信息中心 | Ocean emergency communication scheduling method and system |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114255498A (en) * | 2021-12-13 | 2022-03-29 | 厦门美图之家科技有限公司 | Human face shape migration method, device and equipment based on CNN |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015131242A1 (en) * | 2014-03-06 | 2015-09-11 | David Burton | Mobile data management system |
Also Published As
Publication number | Publication date |
---|---|
CN118193502A (en) | 2024-06-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||