CN112699080A - High-speed multi-path network data migration method - Google Patents
- Publication number
- CN112699080A
- Authority
- CN
- China
- Prior art keywords
- data
- cluster
- index
- index file
- migration
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/119—Details of migration of file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
Abstract
The invention relates to a high-speed multi-path network traffic data migration method comprising the following steps: logging in to the system hosting the ES cluster whose data is to be migrated, extracting the index files according to extraction rule conditions, and configuring the corresponding information; dividing the extracted index files of the data to be migrated equally according to the number of available cluster servers, issuing the resulting index file fragments to the data migration cluster, and having the cluster allocate the fragments to its servers in one-to-one correspondence; having each server extract the corresponding data from the original cluster according to its allocated index file fragment, and having the cluster server that receives the migrated data receive the data, reset the data offset positions and store the data; and feeding back the migration status after the migration is completed. The advantage of the invention is that the indexes are distributed to different servers and each server extracts the original data independently, which increases the data extraction speed.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a high-speed multi-path network data migration method.
Background
With the continuous development of modern information and digitization technologies, more and more data must be collected, data is generated faster and faster, and the volume of stored data keeps multiplying. When equipment is upgraded or data is backed up, the stored data must be migrated. Traditional data migration merely copies files, so it runs into problems such as IO port bottlenecks and the need to re-parse and re-create the data after migration, and its efficiency is extremely low. A better and more efficient network traffic data migration solution is therefore urgently needed.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a high-speed multi-path network traffic data migration method that remedies the shortcomings of existing data migration methods.
The purpose of the invention is realized by the following technical scheme: a high-speed multi-path network traffic data migration method comprises the following steps:
logging in to the system hosting the ES cluster whose data is to be migrated, extracting the index files according to extraction rule conditions, and configuring the corresponding information;
dividing the extracted index files of the data to be migrated equally according to the number of available cluster servers, issuing the resulting index file fragments to the data migration cluster, and having the cluster allocate the fragments to its servers in one-to-one correspondence;
having each server extract the corresponding data from the original cluster according to its allocated index file fragment, and having the cluster server that receives the migrated data receive the data, reset the data offset positions and store the data;
and feeding back the migration status after the migration is completed.
Further, the step of logging in to the system hosting the ES cluster whose data is to be migrated, extracting the index files according to the extraction rule conditions and configuring the corresponding information includes:
a user logs in to the system hosting the ES cluster whose data is to be migrated, sets the extraction rule conditions, executes the extraction query and extracts the index files of the data to be migrated;
and configuring the IP, port and login password of the ES cluster into which the data is to be migrated, and verifying that the configuration information has been entered correctly.
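The extraction rule conditions can be pictured as a filter over flow metadata. The sketch below is only an illustration of combining several markers of the kind the detailed description lists (IP or IP range, port or port range, protocol type). The `FlowRecord` and `ExtractionRule` types and their field names are hypothetical and are not defined by the patent, which does not describe how the ES query itself is built.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FlowRecord:
    """Hypothetical metadata for one indexed flow."""
    ip: str
    port: int
    protocol: str


@dataclass
class ExtractionRule:
    """Hypothetical multi-marker rule; a field left as None is not checked."""
    ip_network: Optional[str] = None            # "10.0.0.5/32" for a fixed IP, "10.0.0.0/24" for a range
    port_range: Optional[Tuple[int, int]] = None  # (low, high), inclusive; low == high for a fixed port
    protocol: Optional[str] = None

    def matches(self, record: FlowRecord) -> bool:
        if self.ip_network is not None:
            network = ipaddress.ip_network(self.ip_network)
            if ipaddress.ip_address(record.ip) not in network:
                return False
        if self.port_range is not None:
            low, high = self.port_range
            if not low <= record.port <= high:
                return False
        if self.protocol is not None and record.protocol.lower() != self.protocol.lower():
            return False
        return True


def select_records(records: List[FlowRecord], rule: ExtractionRule) -> List[FlowRecord]:
    """Keep only the records that satisfy every marker set in the rule."""
    return [record for record in records if rule.matches(record)]
```

A rule with every field set behaves as a combined marker; leaving a field as `None` drops that condition.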
Further, dividing the extracted index files of the data to be migrated equally according to the number of available cluster servers includes:
dividing the index files using the fragment size given by: index fragment size = number of index files / (number of available servers + 1);
and traversing the fragments starting from the first one and placing the remainder index files left over by this division into the fragments one by one, until all remainder index files have been assigned to fragments.
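The division rule above can be sketched in a few lines. This is one possible reading, not the patent's implementation: it assumes integer division, one fragment per available server, and that every index file not covered by the base-size fragments counts as "remainder" and is dealt out cyclically starting from the first fragment. The function name `split_index_files` is hypothetical.

```python
from typing import List, Sequence


def split_index_files(index_files: Sequence[str], available_servers: int) -> List[List[str]]:
    """Divide index files into one fragment per available server."""
    if available_servers < 1:
        raise ValueError("at least one available server is required")
    # Fragment size as stated in the text: index count / (available servers + 1).
    base_size = len(index_files) // (available_servers + 1)
    # Assumption: one fragment per available server, each first filled with base_size files.
    fragments = [
        list(index_files[i * base_size:(i + 1) * base_size])
        for i in range(available_servers)
    ]
    # Assumption: all files not yet placed form the remainder, which is
    # distributed one by one, starting again from the first fragment.
    remainder = index_files[available_servers * base_size:]
    for position, index_file in enumerate(remainder):
        fragments[position % available_servers].append(index_file)
    return fragments
```

With 10 index files and 3 available servers, the base size is 10 // 4 = 2, and the four leftover files are dealt out in turn, giving fragment sizes 4, 3 and 3. Under this reading the fragment sizes never differ by more than one file.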
Further, the step in which each server extracts the corresponding data from the original cluster according to its allocated index file fragment includes:
A1, reading an index file and extracting the secondary index data according to the offsets in the index file;
A2, reading the secondary index data in a loop and extracting the original data according to the secondary index offsets;
A3, sending the original data, the secondary index and the primary index, in that order and each separately, to the cluster server into which the data is migrated;
A4, after the data transmission is completed, reading the next index file and repeating steps A1-A3 until all index files have been processed;
and A5, deleting the allocated index files and returning the processing status to the current cluster system.
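Steps A1-A5 can be sketched as a loop over in-memory structures. The patent does not specify the on-disk index layout, so the entry shapes here are assumptions: a primary entry is `(first secondary position, entry count)` and a secondary entry is `(raw offset, raw length)`. The `send` callback stands in for the network transfer and is likewise hypothetical.

```python
from typing import Callable, List, Tuple

PrimaryEntry = Tuple[int, int]    # assumed layout: (first secondary position, entry count)
SecondaryEntry = Tuple[int, int]  # assumed layout: (raw offset, raw length)


def extract_fragment(
    index_files: List[List[PrimaryEntry]],
    secondary_index: List[SecondaryEntry],
    raw_data: bytes,
    send: Callable[[str, list], None],
) -> str:
    """Process every index file of one fragment and report a status string."""
    for primary_entries in index_files:                 # A4: one index file after another
        secondary_out: List[SecondaryEntry] = []
        raw_out: List[bytes] = []
        for start, count in primary_entries:            # A1: follow primary offsets
            for raw_offset, raw_length in secondary_index[start:start + count]:
                # A2: follow secondary offsets into the original data
                raw_out.append(raw_data[raw_offset:raw_offset + raw_length])
                secondary_out.append((raw_offset, raw_length))
        # A3: original data first, then secondary index, then primary index
        send("raw", raw_out)
        send("secondary", secondary_out)
        send("primary", list(primary_entries))
    index_files.clear()                                 # A5: drop the allocated index files
    return "done"                                       # A5: status returned to the cluster
```

Sending the original data before either index means the receiver already knows where each chunk was stored by the time the index entries that point at it arrive.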
Further, the step in which the cluster server receiving the migrated data receives the data, resets the data offset positions and stores the data includes:
receiving the original data and storing it;
receiving the secondary index data and modifying the secondary index offsets according to the offsets at which the original data was stored;
receiving the primary index data and modifying the primary index offsets according to the offsets at which the secondary index was stored;
and saving the received data to the cluster file system.
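The offset reset on the receiving side can be sketched as follows. This is an illustration under stated assumptions rather than the patent's implementation: storage is modelled as in-memory lists, a secondary entry is assumed to be `(raw offset, raw length)`, a primary entry is assumed to be `(first secondary position, entry count)`, each batch is assumed to arrive in the order original data, secondary index, primary index, and the secondary entries are assumed to be listed in the same order as the raw chunks. The class name `MigrationReceiver` is hypothetical.

```python
from typing import List, Tuple


class MigrationReceiver:
    """Stores incoming batches and rewrites offsets to the new storage positions."""

    def __init__(self) -> None:
        self.raw = bytearray()                       # stands in for the cluster file system
        self.secondary: List[Tuple[int, int]] = []   # (new raw offset, raw length)
        self.primary: List[Tuple[int, int]] = []     # (new secondary position, entry count)
        self._batch_raw_offsets: List[int] = []
        self._batch_secondary_start = 0

    def receive_raw(self, chunks: List[bytes]) -> None:
        # Store the original data and remember where each chunk landed.
        self._batch_raw_offsets = []
        for chunk in chunks:
            self._batch_raw_offsets.append(len(self.raw))
            self.raw.extend(chunk)

    def receive_secondary(self, entries: List[Tuple[int, int]]) -> None:
        # Replace each old raw offset with the offset recorded while storing.
        self._batch_secondary_start = len(self.secondary)
        for new_offset, (_old_offset, length) in zip(self._batch_raw_offsets, entries):
            self.secondary.append((new_offset, length))

    def receive_primary(self, entries: List[Tuple[int, int]]) -> None:
        # Replace each old secondary position with the position in the new index.
        position = self._batch_secondary_start
        for _old_start, count in entries:
            self.primary.append((position, count))
            position += count
```

Because the old offsets are discarded rather than translated, batches can arrive in any order relative to the source layout and still resolve correctly on the new cluster.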
The invention has the following advantages: the high-speed multi-path network traffic data migration method distributes the indexes to different servers, and each server extracts the original data independently, which increases the data extraction speed; each server independently uploads the primary index, the secondary index and the original data, which increases the transmission speed; a CRC (cyclic redundancy check) is added, which reduces migration failures caused by errors occurring during transmission; and after the data migration is completed, the mapping association is re-established automatically, so the migrated data needs no manual processing.
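The CRC check mentioned above can be illustrated with a simple framing scheme. The patent does not say which CRC variant is used or where the checksum is placed, so the choice of CRC-32 (via Python's standard `zlib.crc32`) and the 4-byte big-endian trailer are assumptions made only for this sketch.

```python
import struct
import zlib


def frame(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload as a 4-byte big-endian trailer."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def unframe(data: bytes) -> bytes:
    """Verify the trailer and return the payload, or raise ValueError."""
    if len(data) < 4:
        raise ValueError("frame too short to contain a CRC")
    payload = data[:-4]
    (expected,) = struct.unpack(">I", data[-4:])
    if zlib.crc32(payload) != expected:
        raise ValueError("CRC mismatch: payload was corrupted in transit")
    return payload
```

A receiver that gets a `ValueError` can ask for the batch to be resent instead of storing corrupted data.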
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings, but the scope of the invention is not limited to the following.
As shown in FIG. 1, the invention relates to a high-speed multi-path network traffic data migration method that performs high-speed multi-path migration of original network traffic data files for which a multi-level ES index has been built and which are stored in a hierarchical, distributed manner. Specifically, according to the set migration conditions, the filtered index files are distributed in fragments to different servers of the ES cluster; each server independently extracts the secondary index files and the original data and, once extraction is complete, migrates them synchronously to a new ES cluster; after the new ES cluster has received all the data, it re-establishes the mapping association between the original data and the indexes. The method specifically comprises the following steps:
S1, a user logs in to the system hosting the ES cluster whose data is to be migrated and sets the extraction rule conditions. An extraction rule condition can be a single data marker, such as a fixed IP, an IP range, a fixed port, a port range, a home location or a protocol type, or a combination of several markers. The extraction query is executed with the specified data markers, and the index files of the data to be migrated are extracted;
S2, the IP, port and login password of the ES cluster into which the data is to be migrated are configured, and the configuration information is verified. If verification succeeds, the subsequent processing flow proceeds; if it fails, an error is reported, and the subsequent flow can be entered only after correct configuration information has been entered again.
S3, the extracted index files of the data to be migrated are divided equally according to the number of available cluster servers, using the following index file division algorithm:
the index files are divided using the fragment size given by: index fragment size = number of index files / (number of available servers + 1);
the fragments are then traversed starting from the first one, and the remainder index files left over by this division are placed into the fragments one by one until all remainder index files have been assigned to fragments.
S4, the divided index file fragments are issued to the data migration cluster, and the cluster allocates the fragments to its servers in one-to-one correspondence;
S5, each server extracts the corresponding data from the original cluster according to its allocated index file fragment;
specifically, A1, reading an index file and extracting the secondary index data according to the offsets in the index file;
A2, reading the secondary index data in a loop and extracting the original data according to the secondary index offsets;
A3, sending the original data, the secondary index and the primary index, in that order and each separately, to the cluster server into which the data is migrated;
A4, after the data transmission is completed, reading the next index file and repeating steps A1-A3 until all index files have been processed;
and A5, deleting the allocated index files and returning the processing status to the current cluster system.
S6, the cluster server receiving the migrated data receives the data, resets the data offset positions and finally stores the data in the storage system;
specifically, receiving the original data and storing it;
receiving the secondary index data and modifying the secondary index offsets according to the offsets at which the original data was stored;
receiving the primary index data and modifying the primary index offsets according to the offsets at which the secondary index was stored;
and saving the received data to the cluster file system.
S7, the data migration is completed and the migration status is fed back.
The invention extracts data at high speed over multiple concurrent paths without the servers competing for one another's IO, CPU or memory, and can quickly transfer the extracted data to the servers of the cluster into which it is migrated. "High speed" means that a single server executes only one extraction task at a time, so extraction proceeds without external interference and is fast. "Multi-path" means that all servers in the cluster extract data independently and simultaneously, and each transfers its extracted data independently to the corresponding server in the receiving cluster; the extraction servers and receiving servers are paired, and the servers do not interfere with one another. The servers do not compete for IO, CPU or memory because, for both extraction and reception, the cluster center only distributes tasks and takes no part in transferring the migrated data at any point in the process; the servers in the extraction and receiving clusters are paired and independent, and extraction, transmission and storage are all point-to-point. The extracted data is transferred quickly because each extraction server corresponds to a receiving server and the transmission is point-to-point, which greatly reduces the influence of intermediate network layers on the transmission speed. In addition, the receiving server can perform index remapping each time it receives a batch of original files, so the data is usable as soon as extraction finishes.
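The division of labour described above, where the cluster center only hands out tasks and each extraction server works with its own receiving server, can be sketched with a thread pool standing in for the servers. All names here (`migrate_fragment`, `run_migration`, the server labels) are hypothetical, and the worker only counts index files where a real one would extract and transmit data.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def migrate_fragment(extractor: str, receiver: str, fragment: List[str]) -> Dict[str, object]:
    """Placeholder worker: a real one would extract on `extractor` and send directly to `receiver`."""
    return {"extractor": extractor, "receiver": receiver, "migrated": len(fragment)}


def run_migration(
    extractors: List[str],
    receivers: List[str],
    fragments: List[List[str]],
) -> List[Dict[str, object]]:
    """Coordinator: pairs servers one-to-one, dispatches tasks, and collects status only."""
    if not extractors or not (len(extractors) == len(receivers) == len(fragments)):
        raise ValueError("extractors, receivers and fragments must be non-empty and equal in number")
    # One worker per pair, so each extraction server runs exactly one task at a time.
    with ThreadPoolExecutor(max_workers=len(extractors)) as pool:
        futures = [
            pool.submit(migrate_fragment, extractor, receiver, fragment)
            for extractor, receiver, fragment in zip(extractors, receivers, fragments)
        ]
        return [future.result() for future in futures]
```

The coordinator never touches the migrated data itself; it sees only the status dictionaries that come back, which mirrors the point-to-point design in the text.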
The invention can extract data according to conditions set by the user. The extracted index files are distributed in fragments to the individual servers of the cluster; each server then independently extracts the secondary index files and the original data from the target server and transfers them, which achieves high-speed, multi-path, concurrent extraction and migration of network traffic data. Finally, after the data migration is finished, the new cluster re-establishes the mapping association between the indexes and the original data, which completes the whole data migration process.
The foregoing describes preferred embodiments of the invention. It is to be understood that the invention is not limited to the precise form disclosed herein; it may be used in various other combinations, modifications and environments within the scope of the concept disclosed herein, whether as described above or as apparent to those skilled in the relevant art. Modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (5)
1. A high-speed multi-path network traffic data migration method, characterized in that the data migration method comprises the following steps:
logging in to the system hosting the ES cluster whose data is to be migrated, extracting the index files according to extraction rule conditions, and configuring the corresponding information;
dividing the extracted index files of the data to be migrated equally according to the number of available cluster servers, issuing the resulting index file fragments to the data migration cluster, and having the cluster allocate the fragments to its servers in one-to-one correspondence;
having each server extract the corresponding data from the original cluster according to its allocated index file fragment, and having the cluster server that receives the migrated data receive the data, reset the data offset positions and store the data;
and feeding back the migration status after the migration is completed.
2. The high-speed multi-path network traffic data migration method according to claim 1, characterized in that the step of logging in to the system hosting the ES cluster whose data is to be migrated, extracting the index files according to the extraction rule conditions and configuring the corresponding information comprises:
a user logs in to the system hosting the ES cluster whose data is to be migrated, sets the extraction rule conditions, executes the extraction query and extracts the index files of the data to be migrated;
and configuring the IP, port and login password of the ES cluster into which the data is to be migrated, and verifying that the configuration information has been entered correctly.
3. The high-speed multi-path network traffic data migration method according to claim 1, characterized in that dividing the extracted index files of the data to be migrated equally according to the number of available cluster servers comprises:
dividing the index files using the fragment size given by: index fragment size = number of index files / (number of available servers + 1);
and traversing the fragments starting from the first one and placing the remainder index files left over by this division into the fragments one by one, until all remainder index files have been assigned to fragments.
4. The high-speed multi-path network traffic data migration method according to claim 1, characterized in that the step in which each server extracts the corresponding data from the original cluster according to its allocated index file fragment comprises:
A1, reading an index file and extracting the secondary index data according to the offsets in the index file;
A2, reading the secondary index data in a loop and extracting the original data according to the secondary index offsets;
A3, sending the original data, the secondary index and the primary index, in that order and each separately, to the cluster server into which the data is migrated;
A4, after the data transmission is completed, reading the next index file and repeating steps A1-A3 until all index files have been processed;
and A5, deleting the allocated index files and returning the processing status to the current cluster system.
5. The high-speed multi-path network traffic data migration method according to claim 1, characterized in that the step in which the cluster server receiving the migrated data receives the data, resets the data offset positions and stores the data comprises:
receiving the original data and storing it;
receiving the secondary index data and modifying the secondary index offsets according to the offsets at which the original data was stored;
receiving the primary index data and modifying the primary index offsets according to the offsets at which the secondary index was stored;
and saving the received data to the cluster file system.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110030467.0A CN112699080A (en) | 2021-01-11 | 2021-01-11 | High-speed multi-path network data migration method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110030467.0A CN112699080A (en) | 2021-01-11 | 2021-01-11 | High-speed multi-path network data migration method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112699080A (en) | 2021-04-23 |
Family
ID=75513733
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110030467.0A Pending CN112699080A (en) | 2021-01-11 | 2021-01-11 | High-speed multi-path network data migration method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112699080A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113282537A (en) * | 2021-06-15 | 2021-08-20 | 成都深思科技有限公司 | ES data migration system and migration method |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100070474A1 (en) * | 2008-09-12 | 2010-03-18 | Lad Kamleshkumar K | Transferring or migrating portions of data objects, such as block-level data migration or chunk-based data migration |
CN104348862A (en) * | 2013-07-31 | 2015-02-11 | 华为技术有限公司 | Data migration processing method, apparatus, and system |
CN105205154A (en) * | 2015-09-24 | 2015-12-30 | 浙江宇视科技有限公司 | Data migration method and device |
US20160253339A1 (en) * | 2015-02-26 | 2016-09-01 | Bittitan, Inc. | Data migration systems and methods including archive migration |
CN106973091A (en) * | 2017-03-23 | 2017-07-21 | 中国工商银行股份有限公司 | Distributed memory fast resampling method and system, main control server |
CN111064789A (en) * | 2019-12-18 | 2020-04-24 | 北京三快在线科技有限公司 | Data migration method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210423 |