CN113946553A

CN113946553A - FTP-based file synchronization method for ten-million-scale file volumes

Info

Publication number: CN113946553A
Application number: CN202111252650.1A
Authority: CN
Inventors: 王玉伟; 单震
Original assignee: Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Current assignee: Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Priority date: 2021-10-27
Filing date: 2021-10-27
Publication date: 2022-01-18

Abstract

The invention discloses an FTP-based file synchronization method for file volumes on the order of tens of millions, and belongs to the technical field of computers and file synchronization. The method introduces a Redis store into the file synchronization program: a dedicated in-memory database holds the file information, and the comparison program is split from the file synchronization program. The method supports automatic synchronization of large batches of files, improves the reliability of large-batch synchronization, and avoids manual synchronization, which is time-consuming, labor-intensive, and error-prone; it therefore has good value for popularization and application.

Description

FTP-based file synchronization method for ten-million-scale file volumes

Technical Field

The invention relates to the technical field of computers and file synchronization, and in particular provides an FTP-based file synchronization method for file volumes on the order of tens of millions.

Background

With the industrial revolution in information technology, and especially the innovation and application of big data technology, data has gradually become the third major national basic strategic resource and innovative production factor, after materials and energy. Data security is becoming increasingly important. As big data technology develops, backing up massive data and synchronizing files has become an urgent problem.

However, existing data backup approaches handle the synchronization of very large numbers of files poorly. Automatic synchronization of tens of millions of files easily fails, for example because the server running the synchronization program runs out of memory, so this problem urgently needs to be solved.

Disclosure of Invention

The technical task of the invention is to provide an FTP-based file synchronization method for tens of millions of files that supports automatic synchronization of large batches of files, improves the reliability of large-batch synchronization, and avoids manual synchronization, which is time-consuming, labor-intensive, and error-prone.

To achieve this purpose, the invention provides the following technical solution:

A Redis store is introduced into the file synchronization program; a dedicated in-memory database holds the file information, and the comparison program is split from the file synchronization program.

Preferably, the FTP-based file synchronization method for tens of millions of files comprises the following steps:

S1, starting an FTP service on the backup server and on the file server;

S2, running two file monitoring programs, one connected to the backup server and one to the file server, each of which scans the file directories and file information and stores them in Redis by iterative traversal;

S3, starting a comparison program, which traverses in batches the file-server file information held in Redis;

S4, the file synchronization program scanning keys in Redis at regular intervals and synchronizing the files identified by the scanned keys;

S5, after the file synchronization program completes synchronization of a file, clearing that file's to-be-synchronized key from Redis.

Preferably, in step S1, the backup server is the server that stores the files synchronized from the file server.

The FTP service is started on both the backup server (the server that stores the files synchronized from the file server) and the file server.

Preferably, in step S2, a threshold is set for the traversal, and when the threshold is reached, directory information that is no longer needed is cleared from the synchronization program.

Preferably, in step S2, each monitoring program defines a unique key prefix in Redis to distinguish directory information from different servers, and the value is the file size and update time.

The two file monitoring programs connect to the two servers respectively, scan the file directories and file information (modification time, size, and so on), and store them in Redis by iterative traversal. A threshold is set for the traversal; when it is reached, directory information that is no longer needed is cleared from the synchronization program to avoid memory overload. Each monitoring program defines a unique key prefix in Redis to distinguish directory information from different servers, and the value is the file size and update time. For example, the key for information recorded from the file server may be defined as file_server_1_${file absolute path}, and the key for information recorded from the backup server may be defined as backup_server_1_${file absolute path}. Once both monitoring programs have finished running, Redis holds the complete file information for the files on both servers.
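The storage side of a monitoring program can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the JSON value encoding, and the reading of the "threshold" as a flush point for an in-memory buffer are all assumptions made for illustration, and a plain dict stands in for Redis so the example runs without a server.

```python
import json
from typing import Dict, Iterable, MutableMapping, Tuple

# (absolute path, size in bytes, update time as a string); hypothetical shape
FileEntry = Tuple[str, int, str]


def make_key(prefix: str, path: str) -> str:
    """Build a per-server key such as file_server_1_/data/a.txt."""
    return f"{prefix}{path}"


def make_value(size: int, mtime: str) -> str:
    """Encode file size and update time; JSON is an assumed encoding."""
    return json.dumps({"size": size, "mtime": mtime})


def store_entries(
    store: MutableMapping[str, str],
    prefix: str,
    entries: Iterable[FileEntry],
    threshold: int = 1000,
) -> int:
    """Write entries to the store, flushing the local buffer at `threshold`.

    The buffer never holds more than `threshold` records, which is one
    possible reading of clearing information that is no longer needed.
    """
    pending: Dict[str, str] = {}
    written = 0
    for path, size, mtime in entries:
        pending[make_key(prefix, path)] = make_value(size, mtime)
        if len(pending) >= threshold:
            store.update(pending)  # with a Redis client this would be a bulk write
            written += len(pending)
            pending.clear()
    if pending:  # flush the final partial batch
        store.update(pending)
        written += len(pending)
    return written
```

Running one call with the prefix file_server_1_ and another with backup_server_1_ leaves both servers' file information side by side in the same store, distinguished only by prefix.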

Preferably, in step S3, the file information is compared with the directory of the configured backup server: first, whether a key with the same absolute path exists; second, whether the file size or update time has changed.

Preferably, in step S3, if no such key exists or the file information has changed, a new to-be-synchronized key is generated in Redis.

The comparison program is started and traverses in batches the file-server file information held in Redis. Each entry is compared with the directory of the configured backup server: first, whether a key with the same absolute path exists; second, whether the file size or update time has changed. If no such key exists or the file information has changed, a new to-be-synchronized key is generated in Redis, defined for example as sync_file_1_${file absolute path}. File information that has been compared is deleted immediately.
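The comparison step can be sketched as follows. This is an illustrative reading rather than the patent's code: a dict again stands in for Redis, values are assumed to be JSON strings holding size and update time, and deleting the matching backup entry along with the file-server entry is an assumption about what "compared file information" covers.

```python
import json
from typing import Dict, Iterator, List

FILE_PREFIX = "file_server_1_"
BACKUP_PREFIX = "backup_server_1_"
SYNC_PREFIX = "sync_file_1_"


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def compare_and_mark(store: Dict[str, str], batch_size: int = 1000) -> List[str]:
    """Mark files that are missing or changed on the backup server.

    Returns the absolute paths for which a sync key was created.
    """
    marked: List[str] = []
    # Snapshot the keys so entries can be deleted while iterating; against
    # a real Redis server a cursor-based scan would replace this snapshot.
    file_keys = [key for key in store if key.startswith(FILE_PREFIX)]
    for batch in batched(file_keys, batch_size):
        for key in batch:
            path = key[len(FILE_PREFIX):]
            backup_key = BACKUP_PREFIX + path
            source_info = store[key]
            backup_info = store.get(backup_key)
            # First check: does a key with the same absolute path exist?
            # Second check: do size and update time still match?
            if backup_info is None or json.loads(backup_info) != json.loads(source_info):
                store[SYNC_PREFIX + path] = source_info
                marked.append(path)
            # Delete the compared information right away to free memory.
            del store[key]
            store.pop(backup_key, None)
    return marked
```

After the function returns, only the sync_file_1_ keys remain for the compared files, so the synchronization program has nothing else to sift through.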

Preferably, in step S4, the files identified by the scanned keys are synchronized according to the configured file server information and backup server information, and each file is sent to the backup server.

The file synchronization program (which sends files according to the differences found) regularly scans Redis for keys with the sync_file_1_ prefix and synchronizes the corresponding files according to the configured file server and backup server information. Each file is sent to the backup server.
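One pass of the synchronization program, including the key clearing of step S5, might look like the sketch below. The function names are hypothetical, a dict stands in for Redis, and the transfer is injected as a callable so the scanning logic can be exercised without a live FTP server. `ftp_copy` uses the standard-library `ftplib` calls `retrbinary` and `storbinary`; it assumes the target directory already exists on the backup server.

```python
import tempfile
from ftplib import FTP, all_errors
from typing import Callable, Dict, List

SYNC_PREFIX = "sync_file_1_"


def ftp_copy(source: FTP, backup: FTP, path: str) -> None:
    """Copy one file from the file server to the backup server over FTP."""
    # Spool to disk beyond 8 MiB so large files do not exhaust memory.
    with tempfile.SpooledTemporaryFile(max_size=8 * 1024 * 1024) as buffer:
        source.retrbinary(f"RETR {path}", buffer.write)
        buffer.seek(0)
        backup.storbinary(f"STOR {path}", buffer)


def sync_pending(store: Dict[str, str], transfer: Callable[[str], None]) -> List[str]:
    """Transfer every file that has a sync key; clear the key on success."""
    done: List[str] = []
    for key in [k for k in store if k.startswith(SYNC_PREFIX)]:
        path = key[len(SYNC_PREFIX):]
        try:
            transfer(path)
        except all_errors:
            continue  # keep the key so the next scheduled scan retries it
        del store[key]  # step S5: clear the to-be-synchronized key
        done.append(path)
    return done
```

With two connected `FTP` objects `src` and `dst`, a periodic scan could call `sync_pending(store, lambda p: ftp_copy(src, dst, p))` inside a loop that sleeps between passes. Leaving a failed file's key in place is a design choice of this sketch, not something the source specifies.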

Compared with the prior art, the FTP-based file synchronization method for tens of millions of files has the following outstanding advantages: it supports automatic synchronization of large batches of files, improves the reliability of large-batch synchronization, avoids manual synchronization, which is time-consuming, labor-intensive, and error-prone, and has good value for popularization and application.

Drawings

FIG. 1 is a flowchart of the FTP-based file synchronization method for tens of millions of files;

FIG. 2 is a flowchart of file scanning in the FTP-based file synchronization method for tens of millions of files.

Detailed Description

The FTP-based file synchronization method for tens of millions of files is described in further detail below with reference to the accompanying drawings and embodiments.

Examples

As shown in FIG. 1 and FIG. 2, the FTP-based file synchronization method for tens of millions of files introduces a Redis store into the file synchronization program; a dedicated in-memory database holds the file information, and the comparison program is split from the file synchronization program. The method specifically comprises the following steps:

S1, starting the FTP service on the backup server and on the file server.

The backup server is the server that stores the files synchronized from the file server. The FTP service is started on both the backup server and the file server.

S2, two file monitoring programs connect to the backup server and the file server respectively, scan the file directories and file information, and store them in Redis by iterative traversal.

A threshold is set for the traversal; when it is reached, directory information that is no longer needed is cleared from the synchronization program. Each monitoring program defines a unique key prefix in Redis to distinguish directory information from different servers, and the value is the file size and update time.

S3, a comparison program is started and traverses in batches the file-server file information held in Redis.

Each entry is compared with the directory of the configured backup server: first, whether a key with the same absolute path exists; second, whether the file size or update time has changed. If no such key exists or the file information has changed, a new to-be-synchronized key is generated in Redis.

S4, the file synchronization program scans keys in Redis at regular intervals and synchronizes the files identified by the scanned keys.

The files are synchronized according to the configured file server information and backup server information, and each file is sent to the backup server.
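The directory scan of step S2 is not spelled out in the source. One way a monitoring program could collect size and update time over FTP is the MLSD listing exposed by Python's `ftplib.FTP.mlsd`. The sketch below assumes the server supports MLSD and reports the `type`, `size`, and `modify` facts; `walk_ftp` is a hypothetical name.

```python
from typing import Iterator, List, Tuple


def walk_ftp(ftp, root: str = "/") -> Iterator[Tuple[str, int, str]]:
    """Yield (absolute path, size, modify time) for every file under root.

    `ftp` is any object with an `mlsd(path, facts)` method, for example a
    connected ftplib.FTP instance. Directories are visited iteratively, and
    a directory is dropped from the pending stack as soon as it is listed,
    so only directories not yet visited are kept in memory.
    """
    pending: List[str] = [root]
    while pending:
        directory = pending.pop()
        for name, facts in ftp.mlsd(directory, facts=["type", "size", "modify"]):
            kind = facts.get("type", "").lower()
            path = directory.rstrip("/") + "/" + name
            if kind == "dir":
                pending.append(path)  # visit later
            elif kind == "file":
                yield path, int(facts.get("size", 0)), facts.get("modify", "")
            # other entry types (current or parent directory, links) are skipped
```

Because the function is a generator, its output can be consumed one entry at a time and written to Redis without first building a full file list in memory.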

The above embodiments are merely preferred embodiments of the invention; ordinary changes and substitutions made by those skilled in the art within the technical scope of the invention fall within its scope of protection.

Claims

1. An FTP-based file synchronization method for tens of millions of files, characterized in that: a Redis store is introduced into the file synchronization program, a dedicated in-memory database holds the file information, and the comparison program is split from the file synchronization program.

2. The FTP-based file synchronization method for tens of millions of files according to claim 1, characterized in that the method comprises the following steps:

S1, starting an FTP service on the backup server and on the file server;

3. The FTP-based file synchronization method for tens of millions of files according to claim 2, characterized in that: in step S1, the backup server is the server that stores the files synchronized from the file server.

4. The FTP-based file synchronization method for tens of millions of files according to claim 3, characterized in that: in step S2, a threshold is set for the traversal, and when the threshold is reached, directory information that is no longer needed is cleared from the synchronization program.

5. The FTP-based file synchronization method for tens of millions of files according to claim 4, characterized in that: in step S2, each monitoring program defines a unique key prefix in Redis to distinguish directory information from different servers, and the value is the file size and update time.

6. The FTP-based file synchronization method for tens of millions of files according to claim 5, characterized in that: in step S3, the file information is compared with the directory of the configured backup server, first as to whether a key with the same absolute path exists, and second as to whether the file size or update time has changed.

7. The FTP-based file synchronization method for tens of millions of files according to claim 6, characterized in that: in step S3, if no such key exists or the file information has changed, a new to-be-synchronized key is generated in Redis.

8. The FTP-based file synchronization method for tens of millions of files according to claim 7, characterized in that: in step S4, the files identified by the scanned keys are synchronized according to the configured file server information and backup server information, and each file is sent to the backup server.