CN105959423A - Method for remotely transmitting large number of small files - Google Patents
- Publication number
- CN105959423A (application number CN201610585749.6A)
- Authority
- CN
- China
- Prior art keywords
- file
- network
- network packet
- read
- filename
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
- H04L67/025—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/06—Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer And Data Communications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a method for remotely transmitting a large number of small files from a first computing device to a second computing device. The method comprises the following steps: opening the small files located on the first computing device and reading their contents; adding the files that have been read to a network packet; transmitting the files in the network packet together to the second computing device over a network; and repeating the preceding steps until all files have been transmitted. The method adds a repeat loop to the original flow and, through a new synchronization protocol, defines the contents of the network packet so that one packet carries multiple sets of file operations. Network communications that originally had to be performed many times are thereby combined into one, which increases the utilization of network bandwidth. In addition, a multi-threaded I/O multiplexing technique eliminates the impact of file I/O latency on system performance. The method significantly improves the efficiency of remotely transmitting a very large number of small files.
Description
Technical field
The present invention relates to the technical field of computer networks, and in particular to a method for remotely transmitting a large number of small files.
Background technology
With the rapid growth of unstructured data, computer systems create more and more small files. To improve data security, these files also need to be transferred between different computer systems, for example for synchronization and backup.
At present there are two main classes of methods for transmitting massive numbers of small files:
(1) Transmitting the files one by one with tools such as rsync or ftp, according to a list of the small files. This method generally cannot use the network bandwidth effectively, because the achieved transmission bandwidth is positively correlated with the size of the messages sent over the network: as the network packets grow, the achieved bandwidth grows as well. Transmitting files one by one is therefore limited in bandwidth, and transmission efficiency is very low.
(2) Packing all the small files into one large file in advance and then transmitting it. Packing the small files increases the size of the packets sent over the network and thereby improves file transmission efficiency, but it requires extra storage space for the packed file and cannot be used when storage space is limited. In addition, packing and unpacking the files consumes computing resources and lengthens the overall transmission process.
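The bandwidth argument made for method (1) can be illustrated with a deliberately simplified model that is not taken from the patent: assume every request pays one fixed round-trip latency before its payload moves at link speed. The latency, link speed, and message sizes below are hypothetical values chosen only for illustration.

```python
def effective_throughput(payload_bytes: int, rtt_s: float, link_bytes_per_s: float) -> float:
    """Bytes per second achieved when each message costs one round trip plus its transfer time."""
    transfer_time = payload_bytes / link_bytes_per_s
    return payload_bytes / (rtt_s + transfer_time)


RTT = 0.001    # assumed: 1 ms round trip per request
LINK = 1.25e9  # assumed: 10 Gbit/s link, expressed in bytes per second

# Compare a single 16 KB file per request with larger, batched messages.
for size in (16 * 1024, 1024 * 1024, 16 * 1024 * 1024):
    rate = effective_throughput(size, RTT, LINK)
    print(f"{size:>10} bytes per message: {rate / LINK:6.1%} of link speed")
```

Under this model the fixed per-request cost dominates small messages, which is the effect the background section attributes to one-by-one transfer.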
Summary of the invention
To overcome the deficiencies of the prior art, the present invention provides a method for remotely transmitting a large number of small files, which effectively improves the transmission efficiency of massive numbers of small files.
The technical solution adopted by the present invention to solve the technical problem is as follows:
The present invention first provides a method for remotely transmitting a large number of small files, which transmits a large number of small files located on a first computing device to a second computing device and comprises the following steps:
Step S11: open a small file located on the first computing device and read the file content;
Step S12: add the file that has been read to a network packet;
Step S13: transmit the files in the network packet together to the second computing device over the network;
Step S14: repeat steps S11 to S14 until all files have been transmitted.
Preferably, before step S11 the method further comprises: opening the folder and determining whether all file names have been read; if all file names have been read, ending the operation; if not, reading a file name and then performing step S11.
Preferably, before step S13 the method further comprises: determining whether the network packet is full; if so, performing step S13; if not, repeating steps S11 and S12 until the network packet is filled.
Preferably, step S13 specifically comprises: the second computing device receives the request to save files; creates the multiple files corresponding to the network packet; writes the content of each file; and returns an operation result list to the first computing device.
The present invention also provides a method for remotely transmitting a large number of small files, which transmits a large number of small files located on a first computing device to a second computing device and comprises the following steps:
Step S21: open the folder and determine whether all file names have been read; if all file names have been read, end the operation; if not, read a file name;
Step S22: allocate a number of worker threads;
Step S23: in each worker thread, perform the following operations on a per-thread basis: open the file whose name has been read and read the file content; add the file that has been read to a network packet; transmit the files in the network packet together to the second computing device over the network;
Step S24: return to step S21 until all files have been transmitted.
Preferably, in step S23, before the files in the network packet are transmitted together to the second computing device over the network, the method further comprises: determining whether the network packet is full; if not, repeating the operations of reading file content and adding the file to the network packet until the network packet is filled.
Beneficial effects of the present invention: the invention adds a repeat loop to the original flow and, through a new synchronization protocol, defines the network packet contents so that one packet contains multiple sets of file operations. Network communications that originally had to be performed many times are merged into one, which greatly improves network bandwidth utilization. At the same time, through a multi-threaded I/O multiplexing technique, a thread that is waiting for I/O releases the CPU to other worker threads, which eliminates the impact of file I/O latency on system performance. The invention significantly improves the efficiency of remotely transmitting massive numbers of small files.
Brief description of the drawings
Fig. 1 is a schematic flowchart of Embodiment 1 of the present invention;
Fig. 2 is a schematic flowchart of Embodiment 2 of the present invention;
Fig. 3 is a schematic flowchart of Embodiment 3 of the present invention;
Fig. 4 is a diagram defining the network packet of the small-file transfer protocol in the embodiments of the present invention;
Fig. 5 shows the test results of transmitting 1,000,000 16 KB files using rsync and using the method of the present invention.
Detailed description of the invention
Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Embodiment 1
With reference to Fig. 1, Embodiment 1 of the present invention provides a method for remotely transmitting a large number of small files. Assume that only one folder needs to be synchronized and that it contains only ordinary small files. Here, a small file mainly means a file smaller than 256 KB, and in no case larger than 512 KB; the individual small files may differ in size or be the same size. Transferring all small files under a folder on the local source server to another, remote server involves the following steps:
Step S01: open the folder and determine whether all file names have been read; if all file names have been read, end the operation; if not, read a file name and then perform the subsequent steps;
Step S02: open the file whose name has been read and read the file content;
Step S03: transmit the file content to the remote server over the network;
Step S04: repeat steps S01 to S03 until all files have been transmitted.
Step S03 specifically comprises: the remote server receives the request to save a file; creates the corresponding file; writes the content of the file; and returns an operation result list to the source server.
In this embodiment, several of the operations are slow I/O operations, namely file operations and network operations. The subsequent steps cannot proceed while the method waits for the result of such an operation, so the transmission efficiency is poor.
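The one-request-per-file flow of this embodiment can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `transfer_one_by_one` and `save_file` are hypothetical, and the network is replaced by a plain function call so that the sketch stays self-contained.

```python
import os
from typing import Callable, List, Tuple


def save_file(dest_dir: str, name: str, content: bytes) -> Tuple[str, bool]:
    """Stand-in for the remote server: create the file, write it, report the result."""
    with open(os.path.join(dest_dir, name), "wb") as f:
        f.write(content)
    return name, True


def transfer_one_by_one(
    src_dir: str, send: Callable[[str, bytes], Tuple[str, bool]]
) -> List[Tuple[str, bool]]:
    """Send every regular file in src_dir with one blocking request per file."""
    results = []
    with os.scandir(src_dir) as entries:           # S01: open the folder, read file names
        for entry in entries:
            if not entry.is_file():
                continue
            with open(entry.path, "rb") as f:      # S02: open the file and read its content
                content = f.read()
            # S03: one request per file; the loop blocks until the result comes back.
            results.append(send(entry.name, content))
    return results                                 # S04: done once the names are exhausted
```

Each iteration waits for the file read and then for the request to complete, which is the serial waiting that the embodiment identifies as the cause of poor efficiency.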
Embodiment 2
With reference to Fig. 2, to improve the overall effect of small-file transmission, Embodiment 2 of the present invention provides a method for remotely transmitting a large number of small files, which transmits a large number of small files under a folder on the source server to a remote server and comprises the following steps:
Step S11: open a small file located on the source server and read the file content;
Step S12: add the file that has been read to a network packet;
Step S13: transmit the files in the network packet together to the remote server over the network;
Step S14: repeat steps S11 to S14 until all files have been transmitted.
Before step S11, the method further comprises: opening the folder and determining whether all file names have been read; if all file names have been read, ending the operation; if not, reading a file name and then performing step S11.
Before step S13, the method further comprises: determining whether the network packet is full; if so, performing step S13; if not, repeating steps S11 and S12 until the network packet is filled.
Step S13 specifically comprises: the remote server receives the request to save files; creates the multiple files corresponding to the network packet; writes the content of each file; and returns an operation result list to the source server.
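The receiving side of step S13 might look like the sketch below. It is an assumption-laden illustration: the function name `handle_save_request`, the `(name, content)` tuple representation, and the rule of keeping only the base name of each path are all hypothetical and do not come from the patent.

```python
import os
from typing import List, Tuple


def handle_save_request(dest_dir: str, files: List[Tuple[str, bytes]]) -> List[Tuple[str, str]]:
    """Create and write every file carried by one packet; return one result per file."""
    results = []
    for name, content in files:
        # Hypothetical safety rule: keep only the base name so that a packet
        # cannot write outside dest_dir.
        target = os.path.join(dest_dir, os.path.basename(name))
        try:
            with open(target, "wb") as f:  # create the corresponding file
                f.write(content)           # write its content
            results.append((name, "ok"))
        except OSError as exc:
            results.append((name, f"error: {exc}"))
    return results                         # operation result list for the source server
```

Returning one entry per file lets the sender learn about individual failures even though many files shared a single request.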
This embodiment uses a dynamic, adaptive packet-assembly batch transfer mode. It adds a repeat loop to the original flow described in Embodiment 1 and, through a new synchronization protocol, defines the network packet contents (as shown in Fig. 4) so that one packet contains multiple sets of file operations. Network communications that originally had to be performed many times are merged into one, which greatly improves network bandwidth utilization.
A typical network packet of the small-file transfer protocol is shown in Fig. 4. It begins with the network message control information of the overall synchronization software protocol, which includes the message header, checksum, protocol specification, and similar information. This is followed by the information for the group of files to be transmitted; each file comprises a metadata part and a data part. The metadata contains file attributes such as the file path, file name, size, owner, and time. The data part is the file content to be transmitted.
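Since Fig. 4 is not reproduced here, the exact field layout is unknown. The sketch below shows one possible encoding under clearly hypothetical assumptions: a fixed header holding a magic value, a version, the file count, and a CRC32 checksum of the body, followed by length-prefixed JSON metadata and raw data for each file. The names `MAGIC`, `HEADER`, `encode_packet`, and `decode_packet` are illustrative only.

```python
import json
import struct
import zlib
from typing import Dict, List, Tuple

MAGIC = b"SFT1"                   # hypothetical protocol identifier
HEADER = struct.Struct("!4sHII")  # magic, version, file count, CRC32 of the body

FileRecord = Tuple[Dict[str, object], bytes]  # (metadata, file content)


def encode_packet(files: List[FileRecord]) -> bytes:
    """Serialize control information followed by one metadata/data pair per file."""
    body = bytearray()
    for meta, data in files:
        meta_bytes = json.dumps(meta).encode("utf-8")
        body += struct.pack("!I", len(meta_bytes)) + meta_bytes
        body += struct.pack("!I", len(data)) + data
    return HEADER.pack(MAGIC, 1, len(files), zlib.crc32(body)) + bytes(body)


def decode_packet(packet: bytes) -> List[FileRecord]:
    """Verify the header and checksum, then split the body back into file records."""
    magic, _version, count, crc = HEADER.unpack_from(packet, 0)
    body = packet[HEADER.size:]
    if magic != MAGIC or zlib.crc32(body) != crc:
        raise ValueError("corrupt or unknown packet")
    files, pos = [], 0
    for _ in range(count):
        (meta_len,) = struct.unpack_from("!I", body, pos)
        pos += 4
        meta = json.loads(body[pos:pos + meta_len].decode("utf-8"))
        pos += meta_len
        (data_len,) = struct.unpack_from("!I", body, pos)
        pos += 4
        files.append((meta, body[pos:pos + data_len]))
        pos += data_len
    return files
```

Length prefixes let the receiver split the body without delimiters, and the checksum in the header plays the role of the verification field mentioned in the text.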
Criterion for judging whether the network packet is full:
A suitable network packet size can be chosen according to the network conditions of the deployment. Each time a file's metadata and data are appended to the network packet, the packet length is updated. Before the next file's data is appended, the remaining space in the packet is checked. If the remaining space is insufficient, the current packet is judged to be full and enters the sending flow.
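The fullness criterion can be sketched as a small batching generator. The per-file metadata overhead and the capacity are hypothetical values, since the text leaves the packet size to the deployment. Sending a file that alone exceeds the capacity in its own batch is a choice made for this sketch, not something the text specifies.

```python
from typing import Iterable, Iterator, List, Tuple

PER_FILE_OVERHEAD = 64  # assumed bytes of metadata per file record (hypothetical)


def batch_files(
    files: Iterable[Tuple[str, bytes]], packet_capacity: int
) -> Iterator[List[Tuple[str, bytes]]]:
    """Group (name, data) pairs into batches that fit within packet_capacity."""
    batch: List[Tuple[str, bytes]] = []
    used = 0
    for name, data in files:
        needed = PER_FILE_OVERHEAD + len(data)
        # Before appending, check whether the remaining space is sufficient.
        if batch and used + needed > packet_capacity:
            yield batch            # packet judged full: hand it to the send step
            batch, used = [], 0
        batch.append((name, data))
        used += needed             # update the packet length after each append
    if batch:
        yield batch                # flush the last, possibly partial, packet
```

For example, `list(batch_files(pairs, 1 << 20))` would produce batches of roughly 1 MiB each, with the size chosen per deployment as the text suggests.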
Embodiment 3
With reference to Fig. 3, to further improve the overall effect of small-file transmission, Embodiment 3 of the present invention provides a method for remotely transmitting a large number of small files, which transmits a large number of small files under a folder on the source server to a remote server and comprises the following steps:
Step S21: open the folder and determine whether all file names have been read; if all file names have been read, end the operation; if not, read a file name;
Step S22: allocate a number of worker threads;
Step S23: in each worker thread, perform the following operations on a per-thread basis: open the file whose name has been read and read the file content; add the file that has been read to a network packet; transmit the files in the network packet together to the remote server over the network;
Step S24: return to step S21 until all files have been transmitted.
In step S23, before the files in the network packet are transmitted together to the second computing device over the network, the method further comprises: determining whether the network packet is full; if not, repeating the operations of reading file content and adding the file to the network packet until the network packet is filled.
This embodiment uses a multi-threaded, dynamically allocated I/O multiplexing transfer mode. Within a single thread, the I/O process still has to wait for I/O latency. With multi-threaded I/O multiplexing, however, a thread that is waiting for I/O releases the CPU to other worker threads, which eliminates the impact of file I/O latency on system performance.
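A minimal sketch of the worker-thread arrangement follows. It simplifies the flow in one respect: all file names are placed in a queue up front instead of being read from the folder incrementally. The names `run_workers`, `read_file`, and `send_packet` are hypothetical, and the read and send operations are passed in as callables so that the sketch does not assume any particular file system or network API.

```python
import queue
import threading
from typing import Callable, Iterable, List, Tuple

Batch = List[Tuple[str, bytes]]


def run_workers(
    names: Iterable[str],
    read_file: Callable[[str], bytes],
    send_packet: Callable[[Batch], None],
    num_workers: int = 4,
    packet_capacity: int = 1 << 20,
) -> None:
    """Let several threads read files, fill their own packets, and send them."""
    name_queue: "queue.Queue[str]" = queue.Queue()
    for name in names:
        name_queue.put(name)

    def worker() -> None:
        batch: Batch = []
        used = 0
        while True:
            try:
                name = name_queue.get_nowait()
            except queue.Empty:
                break
            data = read_file(name)   # blocking file I/O; other threads run meanwhile
            if batch and used + len(data) > packet_capacity:
                send_packet(batch)   # blocking network I/O
                batch, used = [], 0
            batch.append((name, data))
            used += len(data)
        if batch:
            send_packet(batch)       # flush this thread's last packet

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Each thread owns its packet, so no locking is needed around the batch itself; only the shared name queue is accessed concurrently, and `queue.Queue` is thread-safe.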
With the transmission method of the present invention, the performance of transmitting massive numbers of small files is significantly improved. The comparative test results are shown in Fig. 5, with the following test environment:
CPU: Intel Xeon E5 series
Memory: 16 GB
Operating system: CentOS 6.4
File system: LeoFS in a 12-disk configuration
Network connection: a single 10 Gigabit Ethernet connection
The above describes only preferred embodiments of the present invention. It should be understood that the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea, and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (6)
1. A method for remotely transmitting a large number of small files, which transmits a large number of small files located on a first computing device to a second computing device, characterized in that it comprises the following steps:
Step S11: open a small file located on the first computing device and read the file content;
Step S12: add the file that has been read to a network packet;
Step S13: transmit the files in the network packet together to the second computing device over the network;
Step S14: repeat steps S11 to S14 until all files have been transmitted.
2. The method for remotely transmitting a large number of small files according to claim 1, characterized in that: before step S11, the method further comprises: opening the folder and determining whether all file names have been read; if all file names have been read, ending the operation; if not, reading a file name and then performing step S11.
3. The method for remotely transmitting a large number of small files according to claim 1, characterized in that: before step S13, the method further comprises: determining whether the network packet is full; if so, performing step S13; if not, repeating steps S11 and S12 until the network packet is filled.
4. The method for remotely transmitting a large number of small files according to claim 1, characterized in that: step S13 specifically comprises: the second computing device receives the request to save files; creates the multiple files corresponding to the network packet; writes the content of each file; and returns an operation result list to the first computing device.
5. A method for remotely transmitting a large number of small files, which transmits a large number of small files located on a first computing device to a second computing device, characterized in that it comprises the following steps:
Step S21: open the folder and determine whether all file names have been read; if all file names have been read, end the operation; if not, read a file name;
Step S22: allocate a number of worker threads;
Step S23: in each worker thread, perform the following operations on a per-thread basis: open the file whose name has been read and read the file content; add the file that has been read to a network packet; transmit the files in the network packet together to the second computing device over the network;
Step S24: return to step S21 until all files have been transmitted.
6. The method for remotely transmitting a large number of small files according to claim 5, characterized in that: in step S23, before the files in the network packet are transmitted together to the second computing device over the network, the method further comprises: determining whether the network packet is full; if not, repeating the operations of reading file content and adding the file to the network packet until the network packet is filled.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610585749.6A CN105959423A (en) | 2016-07-22 | 2016-07-22 | Method for remotely transmitting large number of small files |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610585749.6A CN105959423A (en) | 2016-07-22 | 2016-07-22 | Method for remotely transmitting large number of small files |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105959423A (en) | 2016-09-21 |
Family
ID=56901406
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610585749.6A Pending CN105959423A (en) | 2016-07-22 | 2016-07-22 | Method for remotely transmitting large number of small files |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105959423A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103678491A (en) * | 2013-11-14 | 2014-03-26 | 东南大学 | Method based on Hadoop small file optimization and reverse index establishment |
CN103701860A (en) * | 2013-12-06 | 2014-04-02 | 北京奇虎科技有限公司 | Network transmission and receiving methods and devices for small files, and network transmission system |
CN105069048A (en) * | 2015-07-23 | 2015-11-18 | 东方网力科技股份有限公司 | Small file storage method, query method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9336040B2 (en) | Techniques for remapping sessions for a multi-threaded application | |
US9888048B1 (en) | Supporting millions of parallel light weight data streams in a distributed system | |
CN106506587A (en) | Docker image download method based on distributed storage | |
CN103973560B (en) | Method and apparatus for handling stack link failures in IRF systems | |
WO2016115831A1 (en) | Fault tolerant method, apparatus and system for virtual machine | |
US8533254B1 (en) | Method and system for replicating content over a network | |
CN104219298B (en) | Group system and its method for data backup | |
US20170048352A1 (en) | Computer-readable recording medium, distributed processing method, and distributed processing device | |
CN107689976A (en) | File transmission method and device | |
CN107360233A (en) | File upload method, apparatus, device and readable storage medium | |
CN111490963A (en) | Data processing method, system, equipment and storage medium based on QUIC protocol stack | |
CN106164888A (en) | Ordering scheme for network and storage I/O requests to minimize workload idle time and inter-workload interference | |
CN105827678A (en) | High-availability framework based communication method and node | |
CN107632780A (en) | Striped volume implementation method and storage architecture based on a distributed storage system | |
CN109688606A (en) | Data processing method, device, computer equipment and storage medium | |
CN113938379A (en) | Method for dynamically loading cloud platform log acquisition configuration | |
Anderson et al. | Algorithms for data migration | |
CN103677983A (en) | Scheduling method and device of application | |
CN109939441A (en) | Using discs verifying method and system | |
JP2013543169A (en) | System including middleware machine environment | |
CN105959423A (en) | Method for remotely transmitting large number of small files | |
CN110417860A (en) | File transfer management method, apparatus, equipment and storage medium | |
CN114490458B (en) | Data transmission method, chip, server and storage medium | |
CN102710772B (en) | Mass data communication system based on cloud platform | |
US10178014B2 (en) | File system, control program of file system management device, and method of controlling file system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160921 |