CN113946553A - File synchronization method of ten-million-level file volume based on ftp - Google Patents


Info

Publication number
CN113946553A
Authority
CN
China
Prior art keywords
file
ftp
information
server
redis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111252650.1A
Other languages
Chinese (zh)
Inventor
王玉伟
单震
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Original Assignee
Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-10-27
Filing date
2021-10-27
Publication date
2022-01-18
Application filed by Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Priority to CN202111252650.1A
Publication of CN113946553A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an ftp-based file synchronization method for ten-million-level file volumes and belongs to the technical field of computers and file synchronization. The method introduces a redis store into the file synchronization program: a dedicated in-memory database holds the file information, and the comparison program is split from the file synchronization program. The method supports automatic synchronization of large batches of files, improves the reliability of large-batch file synchronization, avoids manual synchronization, which is time-consuming, laborious and error-prone, and has good value for popularization and application.

Description

File synchronization method of ten-million-level file volume based on ftp
Technical Field
The invention relates to the technical field of computers and file synchronization, and in particular provides an ftp-based file synchronization method for ten-million-level file volumes.
Background
With the wave of the information technology industrial revolution, and especially the innovation and application of big data technology, data has gradually become the third major national basic strategic resource and innovative production factor after materials and energy. Data security is becoming increasingly important. With the development of big data technology, massive data backup and file synchronization have become urgent problems.
Existing data backup, however, handles the synchronization of very large numbers of files poorly. Automatic synchronization of tens of millions of files easily fails, for example because the server hosting the synchronization program runs out of memory, so this problem urgently needs to be solved.
Disclosure of Invention
The technical task of the invention is to provide an ftp-based file synchronization method for ten-million-level file volumes that supports automatic synchronization of large batches of files, improves the reliability of large-batch file synchronization, and avoids manual synchronization, which is time-consuming, laborious and error-prone.
To achieve this purpose, the invention provides the following technical solution:
A redis store is introduced into the file synchronization program; a dedicated in-memory database holds the file information, and the comparison program is split from the file synchronization program.
Preferably, the ftp-based file synchronization method for ten-million-level file volumes comprises the following steps:
S1, starting the ftp service on the backup server and on the file server;
S2, two file monitoring programs connect to the backup server and the file server respectively, scan the file directories and file information, and store them in redis by looping through the directory tree;
S3, starting a comparison program, which traverses in batches the file information of the file server stored in redis;
S4, the file synchronization program periodically scans the keys in redis and synchronizes the files identified by the scanned keys;
S5, after the file synchronization program completes the file synchronization, clearing the file's to-be-synchronized key in redis.
Preferably, in step S1, the backup server is the server that stores the files synchronized from the file server.
The ftp service is started on both the backup server (the server that stores the files synchronized from the file server) and the file server.
Preferably, in step S2, a threshold is set for the traversal, and when the threshold is reached, directory information that is no longer needed is cleared from the synchronization program.
Preferably, in step S2, each monitoring program defines a unique key prefix in redis to distinguish directory information from different servers, and the value is the file size and update time.
The two file monitoring programs connect to the two servers respectively, scan the file directories and file information (modification time, size and so on), and store them in redis by looping through the directory tree. A threshold is set for the traversal; when it is reached, directory information that is no longer needed is cleared from the synchronization program to avoid memory overload. Each monitoring program defines a unique key prefix in redis to distinguish directory information from different servers, and the value is the file size and update time. For example, the key for information recorded from the file server may be defined as file_server_1_${file absolute path}, and the key for information recorded from the backup server may be defined as backup_server_1_${file absolute path}. After both monitoring programs have finished running, redis holds the complete file information for the files on both servers.
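The key scheme described above can be illustrated with a minimal sketch. The prefixes come from the examples in the text; the JSON value encoding, the helper names (make_key, encode_value, decode_value), the sample paths and the plain dict standing in for redis are assumptions made only for illustration.

```python
import json

# Prefixes taken from the examples in the text.
FILE_PREFIX = "file_server_1_"
BACKUP_PREFIX = "backup_server_1_"


def make_key(prefix: str, abs_path: str) -> str:
    """Build the key for one file: server prefix + absolute path."""
    return prefix + abs_path


def encode_value(size: int, mtime: str) -> str:
    """Serialize file size and update time (JSON is an assumption)."""
    return json.dumps({"size": size, "mtime": mtime}, sort_keys=True)


def decode_value(raw: str) -> dict:
    """Inverse of encode_value."""
    return json.loads(raw)


# A dict stands in for redis here; a real client would issue SET/GET.
store = {}
store[make_key(FILE_PREFIX, "/data/a.txt")] = encode_value(1024, "20211027120000")
store[make_key(BACKUP_PREFIX, "/data/a.txt")] = encode_value(1000, "20211020080000")
```

Because both servers share the same absolute-path suffix, the matching backup entry for any file entry can be found by swapping the prefix, without scanning.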
Preferably, in step S3, the file information is compared with the directory of the configured backup server: first, whether a key with the same absolute path exists; second, whether the size or update time of the file has changed.
Preferably, in step S3, if the key does not exist or the file information has changed, a new to-be-synchronized key is generated in redis.
A comparison program is started, which traverses in batches the file information of the file server stored in redis and compares it with the directory of the configured backup server: first, whether a key with the same absolute path exists; second, whether the size or update time of the file has changed. If the key does not exist or the file information has changed, a new to-be-synchronized key is generated in redis, for example sync_file_1_${file absolute path}. File information that has been compared is deleted directly.
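The comparison rule can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: a dict stands in for redis, values are (size, update time) tuples, the to-be-synchronized key stores the file's metadata, and the matching backup entry is deleted together with the file entry. The text fixes none of these details.

```python
FILE_PREFIX = "file_server_1_"
BACKUP_PREFIX = "backup_server_1_"
SYNC_PREFIX = "sync_file_1_"


def needs_sync(file_meta, backup_meta):
    """True when the backup key is missing or size/update time differ."""
    if backup_meta is None:  # no key with the same absolute path
        return True
    return file_meta != backup_meta


def compare(store):
    """Create sync keys for missing or changed files, then drop compared entries."""
    file_keys = [k for k in store if k.startswith(FILE_PREFIX)]
    for key in file_keys:
        path = key[len(FILE_PREFIX):]
        backup_key = BACKUP_PREFIX + path
        if needs_sync(store[key], store.get(backup_key)):
            store[SYNC_PREFIX + path] = store[key]  # value choice is an assumption
        # Compared file information is deleted directly, as the text states.
        del store[key]
        store.pop(backup_key, None)  # deleting the backup entry is an assumption


store = {
    "file_server_1_/a": (10, "t1"),    # unchanged
    "backup_server_1_/a": (10, "t1"),
    "file_server_1_/b": (20, "t2"),    # size changed
    "backup_server_1_/b": (15, "t2"),
    "file_server_1_/c": (30, "t3"),    # missing on the backup server
}
compare(store)
print(sorted(store))
```

After the call, only the to-be-synchronized keys for /b and /c remain in the store.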
Preferably, in step S4, the files identified by the scanned keys are synchronized according to the configured file server information and backup server information, and the files are sent to the backup server.
The file synchronization program (which sends files according to the differences found) periodically scans redis for keys with the sync_file_1 prefix and synchronizes the files identified by the scanned keys according to the configured file server information and backup server information. The files are sent to the backup server.
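A timed scan of the to-be-synchronized keys might look like the sketch below. The send callable is hypothetical (with FTP it could wrap an upload to the backup server), a dict stands in for redis, and the interval and round count are illustrative parameters. Clearing keys after a transfer belongs to step S5 and is deliberately left out here.

```python
import time

SYNC_PREFIX = "sync_file_1_"


def pending_paths(store):
    """Absolute paths of all files that currently have a sync key."""
    return [k[len(SYNC_PREFIX):] for k in store if k.startswith(SYNC_PREFIX)]


def sync_once(store, send):
    """One scan: hand every pending path to `send` and return the paths sent."""
    sent = []
    for path in pending_paths(store):
        send(path)  # hypothetical transfer to the backup server
        sent.append(path)
    return sent


def run_periodically(store, send, interval_s, rounds):
    """Repeat the scan at a fixed interval for a bounded number of rounds."""
    for _ in range(rounds):
        sync_once(store, send)
        time.sleep(interval_s)


store = {"sync_file_1_/b": 1, "sync_file_1_/c": 2, "file_server_1_/x": 3}
log = []
sync_once(store, log.append)
```

Keys with other prefixes are ignored, so the synchronization program never touches the raw scan results written by the monitoring programs.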
Compared with the prior art, the ftp-based file synchronization method for ten-million-level file volumes has the following outstanding advantages: it supports automatic synchronization of large batches of files, improves the reliability of large-batch file synchronization, avoids manual synchronization, which is time-consuming, laborious and error-prone, and has good value for popularization and application.
Drawings
FIG. 1 is a flowchart of the ftp-based file synchronization method for ten-million-level file volumes;
FIG. 2 is a flowchart of file scanning in the ftp-based file synchronization method for ten-million-level file volumes.
Detailed Description
The ftp-based file synchronization method for ten-million-level file volumes of the present invention is described in further detail below with reference to the accompanying drawings and embodiments.
Examples
As shown in FIG. 1 and FIG. 2, the ftp-based file synchronization method for ten-million-level file volumes of the present invention introduces a redis store into the file synchronization program; a dedicated in-memory database holds the file information, and the comparison program is split from the file synchronization program. The method specifically comprises the following steps:
S1, starting the ftp service on the backup server and on the file server.
The backup server is the server that stores the files synchronized from the file server. The ftp service is started on both the backup server and the file server.
S2, two file monitoring programs connect to the backup server and the file server respectively, scan the file directories and file information, and store them in redis by looping through the directory tree.
The two file monitoring programs connect to the two servers respectively, scan the file directories and file information (modification time, size and so on), and store them in redis by looping through the directory tree. A threshold is set for the traversal; when it is reached, directory information that is no longer needed is cleared from the synchronization program to avoid memory overload. Each monitoring program defines a unique key prefix in redis to distinguish directory information from different servers, and the value is the file size and update time. For example, the key for information recorded from the file server may be defined as file_server_1_${file absolute path}, and the key for information recorded from the backup server may be defined as backup_server_1_${file absolute path}. After both monitoring programs have finished running, redis holds the complete file information for the files on both servers.
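One possible reading of the traversal threshold is sketched below: the monitoring program buffers scanned entries in memory and, once the buffer reaches the threshold, writes them out and clears the buffer. This interpretation is an assumption, as are the list_dir callable (a stand-in for an FTP directory listing such as Python's ftplib.FTP.mlsd), the tuple layout it returns, and the dict standing in for redis.

```python
def scan(root, list_dir, store, prefix, threshold=1000):
    """Walk directories iteratively; flush buffered entries at the threshold.

    `list_dir(path)` is a hypothetical callable returning tuples of
    (name, is_dir, size, mtime) for one directory.
    Returns the number of threshold-triggered flushes.
    """
    pending_dirs = [root]  # explicit stack instead of recursion
    buffer = {}
    flushes = 0
    while pending_dirs:
        current = pending_dirs.pop()
        for name, is_dir, size, mtime in list_dir(current):
            path = current.rstrip("/") + "/" + name
            if is_dir:
                pending_dirs.append(path)
            else:
                buffer[prefix + path] = (size, mtime)
            if len(buffer) >= threshold:
                store.update(buffer)  # write the batch to the store
                buffer.clear()        # release in-process memory
                flushes += 1
    store.update(buffer)  # write whatever is left
    return flushes


# A tiny fake directory tree standing in for an FTP server.
tree = {
    "/": [("a.txt", False, 1, "t"), ("sub", True, 0, "")],
    "/sub": [("b.txt", False, 2, "t"), ("c.txt", False, 3, "t")],
}
store = {}
flushes = scan("/", lambda p: tree[p], store, "file_server_1_", threshold=2)
```

With a threshold of 2, the three files are written in one threshold-triggered flush plus a final flush, so the in-process buffer never holds more than two entries.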
S3, starting a comparison program, which traverses in batches the file information of the file server stored in redis.
The comparison program compares this information with the directory of the configured backup server: first, whether a key with the same absolute path exists; second, whether the size or update time of the file has changed. If the key does not exist or the file information has changed, a new to-be-synchronized key is generated in redis, for example sync_file_1_${file absolute path}. File information that has been compared is deleted directly.
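The batch traversal can be sketched with a small chunking helper so that only one batch of keys is processed at a time. The batch size, the helper names and the dict standing in for redis are assumptions; with a real redis client the key stream would typically come from an incremental SCAN and each batch lookup could be a single MGET.

```python
from itertools import islice

FILE_PREFIX = "file_server_1_"
BACKUP_PREFIX = "backup_server_1_"


def batches(iterable, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def changed_paths(store, batch_size=2):
    """Yield paths whose backup entry is missing or differs, batch by batch."""
    file_keys = (k for k in store if k.startswith(FILE_PREFIX))
    for chunk in batches(file_keys, batch_size):
        paths = [k[len(FILE_PREFIX):] for k in chunk]
        # One lookup round per batch instead of one round trip per file.
        backup_vals = [store.get(BACKUP_PREFIX + p) for p in paths]
        for key, path, backup in zip(chunk, paths, backup_vals):
            if backup is None or backup != store[key]:
                yield path


store = {
    "file_server_1_/a": (1, "t"), "backup_server_1_/a": (1, "t"),
    "file_server_1_/b": (2, "t"), "backup_server_1_/b": (2, "old"),
    "file_server_1_/c": (3, "t"),
}
result = list(changed_paths(store, batch_size=2))
```

The store must not be modified while the generator is being consumed; in this sketch the result is collected into a list first.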
S4, the file synchronization program periodically scans the keys in redis and synchronizes the files identified by the scanned keys.
The file synchronization program (which sends files according to the differences found) periodically scans redis for keys with the sync_file_1 prefix and synchronizes the files identified by the scanned keys according to the configured file server information and backup server information. The files are sent to the backup server.
S5, after the file synchronization program completes the file synchronization, clearing the file's to-be-synchronized key in redis.
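Step S5 can be sketched as clearing each to-be-synchronized key only once its file has been sent. Leaving the key in place when a transfer fails, so that the next timed scan retries it, is an assumption that goes beyond the text; the send callable, the choice of OSError as the failure signal and the dict standing in for redis are likewise illustrative.

```python
SYNC_PREFIX = "sync_file_1_"


def sync_and_clear(store, send):
    """Send each pending file; clear its key only after the send succeeds.

    Returns the paths whose transfer failed and whose keys were kept.
    """
    failed = []
    for key in [k for k in store if k.startswith(SYNC_PREFIX)]:
        path = key[len(SYNC_PREFIX):]
        try:
            send(path)  # hypothetical transfer to the backup server
        except OSError:
            failed.append(path)  # key stays, so a later scan can retry
            continue
        del store[key]  # step S5: clear the to-be-synchronized key
    return failed


def flaky_send(path):
    """Test double: fails for one path, succeeds for the rest."""
    if path == "/bad":
        raise OSError("transfer failed")


store = {"sync_file_1_/ok": 1, "sync_file_1_/bad": 2}
failed = sync_and_clear(store, flaky_send)
```

Deleting the key after the transfer rather than before means an interrupted run can at worst resend a file, never silently skip one.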
The embodiments described above are merely preferred embodiments of the present invention; ordinary changes and substitutions made by those skilled in the art within the technical scope of the present invention fall within its scope of protection.

Claims (8)

1. An ftp-based file synchronization method for ten-million-level file volumes, characterized in that: the method introduces a redis store into a file synchronization program; a dedicated in-memory database holds the file information, and the comparison program is split from the file synchronization program.
2. The ftp-based file synchronization method for ten-million-level file volumes according to claim 1, characterized in that the method comprises the following steps:
S1, starting the ftp service on the backup server and on the file server;
S2, two file monitoring programs connect to the backup server and the file server respectively, scan the file directories and file information, and store them in redis by looping through the directory tree;
S3, starting a comparison program, which traverses in batches the file information of the file server stored in redis;
S4, the file synchronization program periodically scans the keys in redis and synchronizes the files identified by the scanned keys;
S5, after the file synchronization program completes the file synchronization, clearing the file's to-be-synchronized key in redis.
3. The ftp-based file synchronization method for ten-million-level file volumes according to claim 2, characterized in that: in step S1, the backup server is the server that stores the files synchronized from the file server.
4. The ftp-based file synchronization method for ten-million-level file volumes according to claim 3, characterized in that: in step S2, a threshold is set for the traversal, and when the threshold is reached, directory information that is no longer needed is cleared from the synchronization program.
5. The ftp-based file synchronization method for ten-million-level file volumes according to claim 4, characterized in that: in step S2, each monitoring program defines a unique key prefix in redis to distinguish directory information from different servers, and the value is the file size and update time.
6. The ftp-based file synchronization method for ten-million-level file volumes according to claim 5, characterized in that: in step S3, the file information is compared with the directory of the configured backup server: first, whether a key with the same absolute path exists; second, whether the size or update time of the file has changed.
7. The ftp-based file synchronization method for ten-million-level file volumes according to claim 6, characterized in that: in step S3, if the key does not exist or the file information has changed, a new to-be-synchronized key is generated in redis.
8. The ftp-based file synchronization method for ten-million-level file volumes according to claim 7, characterized in that: in step S4, the files identified by the scanned keys are synchronized according to the configured file server information and backup server information, and the files are sent to the backup server.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111252650.1A CN113946553A (en) 2021-10-27 2021-10-27 File synchronization method of ten-million-level file volume based on ftp

Publications (1)

Publication Number Publication Date
CN113946553A true CN113946553A (en) 2022-01-18

Family

ID=79332672

Country Status (1)

Country Link
CN (1) CN113946553A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination