WO2012065520A1 - File transmission system and method - Google Patents

File transmission system and method

Info

Publication number
WO2012065520A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
event
client
thread
socket
Prior art date
Application number
PCT/CN2011/081949
Other languages
English (en)
French (fr)
Inventor
陈天健
陈勇
蔡泽霖
李宏博
王俊
Original Assignee
深圳华大基因科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大基因科技有限公司
Publication of WO2012065520A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/542 Event management; Broadcasting; Multicasting; Notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/165 Combined use of TCP and UDP protocols; selection criteria therefor

Definitions

  • The present invention relates to the field of network data transmission technologies, and in particular to a file transmission system and method.
  • Huada Gene Research Institute has world-class gene sequencing capabilities and generates tens of terabytes of data per day, which places high demands on data transmission. In addition, data synchronization between the headquarters and the sub-centres distributed throughout China and overseas places very high demands on bandwidth utilization.
  • Although submarine cable technology has matured over the past 30 years, the signal delay introduced by long-distance network transmission is still unavoidable.
  • Long-distance transmission, such as an optical signal travelling from Beijing to New York, introduces a delay of at least 60 ms.
  • With the traditional TCP protocol, this signal delay seriously degrades transmission performance.
  • When the bandwidth-delay product (BDP) is large, the TCP protocol becomes inefficient. This is because the AIMD algorithm sharply reduces the TCP congestion window but cannot quickly recover the available bandwidth. Theoretical traffic analysis shows that TCP becomes more vulnerable to packet loss as the BDP increases, and cannot complete transmission tasks efficiently.
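  • To make the BDP argument concrete, the following is a minimal arithmetic sketch rather than anything taken from the patent. It assumes a 1 Gbit/s link, a 120 ms round trip (two 60 ms one-way delays), a 1460-byte segment, and a Reno-style AIMD that halves the window on a loss and then grows it by one segment per round trip. All of these figures and the function names are illustrative assumptions.

```python
MSS_BYTES = 1460  # assumed TCP segment payload size


def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return bandwidth_bps * rtt_ms // 8000  # (bits/s * ms) -> bytes


def aimd_recovery_seconds(bandwidth_bps: int, rtt_ms: int) -> float:
    """Time for additive increase to regain a window that was halved once."""
    window_segments = bdp_bytes(bandwidth_bps, rtt_ms) // MSS_BYTES
    rtts_needed = window_segments // 2  # one segment regained per round trip
    return rtts_needed * rtt_ms / 1000


# Assumed link: 1 Gbit/s with a 120 ms round-trip time.
link_bdp = bdp_bytes(1_000_000_000, 120)
recovery = aimd_recovery_seconds(1_000_000_000, 120)
print(link_bdp, recovery)
```

  Under these assumptions a single loss costs roughly ten minutes of reduced throughput, which is the inefficiency the text attributes to TCP on high-BDP paths.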
  • Traditional TCP-based file transmission systems and methods therefore cannot meet the growing demand for large-volume, long-distance, real-time transmission in terms of speed and reliability, and have become a bottleneck restricting the rapid development of network data transmission technology.
  • The present invention aims to solve at least one of the technical problems existing in the prior art.
  • To this end, the present invention provides a file transfer system and method that use a TCP protocol server and a UDT protocol server with corresponding functions, fully exploit the performance of the network hardware, and achieve low latency when transmitting large volumes of data over long distances.
  • the invention provides a file transfer system.
  • The file transfer system includes a TCP server and a UDT server. The TCP server is configured to: implement an asynchronous non-blocking network mode at the socket layer by means of the libevent event-triggered network library;
  • when a client issues a request, encapsulate the request into an event notification, hash the client's control connection handle by taking its remainder modulo the total number of session threads, and assign the event notification to the task queue of the corresponding session thread;
  • take the event notification out of the session thread's task queue, call the finite state machine to process it and obtain the corresponding response code, and return the response code to the client through the socket layer;
  • and, when the client's request requires reading or writing data on disk, encapsulate the request into an event notification and add it to the task queue of the disk processing thread; the disk processing thread reads or writes a large file between the disk and the cache in a single pass, after which the session thread fetches data from the cache multiple times and interacts with the client multiple times through the socket layer.
  • The UDT server is configured to: establish a main thread for listening, a command-processing thread pool, a file read/write thread pool, and a data compression thread pool; when a client issues a request, the main thread assigns the requested task to a thread in the command-processing thread pool.
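  • The assignment of event notifications to session threads by control connection handle can be sketched as follows. This is an illustrative sketch, not the patented implementation: the class names, the field names, and the use of Python's `queue.Queue` as the task queue are assumptions.

```python
import queue
from dataclasses import dataclass


@dataclass
class EventNotification:
    control_handle: int   # control connection handle (an int, e.g. a descriptor)
    operation: str        # requested operation type
    payload: bytes = b""  # request data


class SessionDispatcher:
    """Routes each notification to the task queue of one fixed session thread."""

    def __init__(self, n_session_threads: int) -> None:
        self.queues = [queue.Queue() for _ in range(n_session_threads)]

    def thread_index(self, control_handle: int) -> int:
        # The "hash" is the remainder of the handle modulo the thread count.
        return control_handle % len(self.queues)

    def dispatch(self, note: EventNotification) -> int:
        index = self.thread_index(note.control_handle)
        self.queues[index].put(note)
        return index


dispatcher = SessionDispatcher(n_session_threads=4)
first = dispatcher.dispatch(EventNotification(9, "USER", b"alice"))
second = dispatcher.dispatch(EventNotification(9, "PASS", b"secret"))
other = dispatcher.dispatch(EventNotification(14, "USER", b"bob"))
```

  Because the index depends only on the handle, every request of one session lands in the same queue and is processed in arrival order by the same thread.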
  • the file transfer system employs a high speed transfer TCP server and a high speed transfer UDT server, and the file transfer system thus constructed is a high performance file transfer system.
  • the inventors have found that when a high-speed transmission TCP server and a high-speed transmission UDT server are used to form a file transmission system, the file transmission system can efficiently transmit a large amount of data over a long distance, and the transmission speed is very fast, and the transmission reliability is good.
  • The terms "high-speed transmission" and "high performance" have no specific meaning and are not particularly limited. Those skilled in the art will understand that when the TCP server and the UDT server transmit quickly, the file transmission system they compose can transmit large amounts of data over long distances in real time, solving the problems of high delay and poor reliability in such transfers. The file transmission system can therefore be said to have high performance.
  • The TCP server includes: a TCP interface module, configured to receive requests sent by clients, implement an asynchronous non-blocking network mode at the socket layer by means of the libevent event-triggered network library, hash the handle of the control connection, and assign data to the task queues of different session threads;
  • a session manager, configured to take data from the task queues of the different session threads, call the finite state machine to process the data and obtain the corresponding response code, return the response code to the client through the socket layer, and have the session thread fetch data from the cache multiple times and interact with the client multiple times through the socket layer; and
  • a disk manager, configured to add the request to the task queue of the disk processing thread when a client request requires reading or writing data on disk, and to read or write large files between the disk and the cache in a single pass.
  • the TCP server of the present invention can be effectively applied to a file transmission system.
  • The UDT server includes: a UDT interface module, configured to receive requests sent by clients, interact with the clients, and transmit compressed data to the clients;
  • a session processing module, configured to establish a main thread for listening, a command-processing thread pool, a file read/write thread pool, and a data compression thread pool. When a client issues a request, the main thread assigns the requested task to an idle thread in the command-processing thread pool, and that thread handles the data connection used to transfer data. When files are read or written, the command-processing thread, the file read/write thread, and the data compression thread cooperate through synchronous communication; operations that do not read or write files are handled directly by the command-processing thread; and
  • a disk manager, configured to compress the data to be transmitted, providing two workspaces during compression (one for the data before compression and one for the compressed data) and one workspace during decompression for the decompressed data.
  • the UDT server of the present invention can be effectively applied to a file transmission system.
  • the TCP server employs a fully asynchronous architectural mode.
  • the UDT server employs a synchronous blocking mode.
  • data of a text type file is compressed using an LZO compression algorithm.
  • The libevent event-triggered network library is used to set up a group of events for each control connection and each data connection; each group of four events implements control command communication and data transmission with the client.
  • The four events in each group are: the accept-client event, the read-data event, the notification event, and the write-data event.
  • The inventors have surprisingly found that the file transmission system according to the embodiment of the present invention can transmit large-scale data quickly and efficiently, and can ensure real-time transmission speed and reliability over long distances, showing great advantages over current file transmission systems.
  • The file transfer method includes a process in which the TCP server accepts a client event. The process includes: initializing a listening socket, setting the listening mode, and setting the socket to non-blocking network mode; initializing an accept event for the listening socket, activating the accept event, and adding it to the libevent event-triggered network library; when a client issues a request, the TCP server automatically invoking the accept-client event; and accepting the client, generating the client socket, encapsulating the control connection handle together with the requested operation type into an event notification, and adding the event notification to the task queue.
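  • The accept flow above can be sketched with Python's standard `selectors` module standing in for libevent; this substitution is an assumption made so that the example is self-contained, and the dictionary used as the event notification and the "ACCEPT" operation name are hypothetical.

```python
import queue
import selectors
import socket

task_queue = queue.Queue()              # stands in for a session thread's task queue
selector = selectors.DefaultSelector()  # stand-in for the libevent event base


def on_accept(listener: socket.socket) -> None:
    """Accept-client event: create the client socket and queue a notification."""
    client, _addr = listener.accept()
    client.setblocking(False)
    # The notification pairs the control connection handle with an operation type.
    task_queue.put({"handle": client.fileno(), "operation": "ACCEPT", "sock": client})


def make_listener() -> socket.socket:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
    listener.listen()                   # set the listening mode
    listener.setblocking(False)         # non-blocking network mode
    selector.register(listener, selectors.EVENT_READ, on_accept)  # activate accept event
    return listener


def run_once(timeout: float = 1.0) -> None:
    """One pass of the event loop: invoke the callback of every ready event."""
    for key, _mask in selector.select(timeout):
        key.data(key.fileobj)


listener = make_listener()
peer = socket.create_connection(listener.getsockname())  # a client makes a request
run_once()
notification = task_queue.get_nowait()

# Tidy up the sockets used by the demonstration.
peer.close()
notification["sock"].close()
selector.unregister(listener)
listener.close()
```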
  • The file transfer method further includes a process in which the TCP server reads a data event from the client. The process includes: when the accept-client event is invoked, the TCP server accepting the client and generating the client socket; initializing a read event for the client socket, activating the read event, and adding it to the libevent event-triggered network library; when the client sends data, the TCP server automatically invoking the read event; and reading the data through the client socket, encapsulating the data into an event notification, and adding it to the task queue.
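  • A minimal sketch of the read-data event follows, again using the standard `selectors` module as an assumed stand-in for libevent. A `socket.socketpair()` replaces a real client connection, and the notification layout is hypothetical.

```python
import queue
import selectors
import socket

task_queue = queue.Queue()
selector = selectors.DefaultSelector()


def on_read(client: socket.socket) -> None:
    """Read event: pull the bytes that arrived and queue them as a notification."""
    data = client.recv(4096)
    if not data:                        # the peer closed the connection
        selector.unregister(client)
        client.close()
        return
    task_queue.put({"handle": client.fileno(), "operation": "READ", "data": data})


# socketpair() yields two connected sockets, so no real network is needed.
server_side, client_side = socket.socketpair()
server_side.setblocking(False)
selector.register(server_side, selectors.EVENT_READ, on_read)  # activate the read event

client_side.sendall(b"USER alice\r\n")  # the client sends data
for key, _mask in selector.select(timeout=1.0):
    key.data(key.fileobj)               # the loop invokes the read event

notification = task_queue.get_nowait()
```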
  • The file transfer method further includes a process for the TCP server notification event. The process includes: at initialization, the TCP server binding a pair of sockets, a notification socket and a delivery socket; initializing a notification event for the notification socket and activating it; when the TCP server needs to write data, sending an event notification to the delivery socket; the TCP server automatically invoking the notification event; and reading the event notification from the notification socket, inserting it into the task queue that the write event consumes, and activating the write event.
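  • The notification/delivery socket pair acts as a wake-up channel into the event loop. The sketch below illustrates the idea with the standard `selectors` module as an assumed stand-in for libevent; the 8-byte message format and the `armed_write_handles` set that models "activating the write event" are hypothetical.

```python
import queue
import selectors
import socket
import struct

NOTE = struct.Struct("!iI")     # assumed wire format: client handle, data length
write_queue = queue.Queue()     # task queue consumed by the write event
armed_write_handles = set()     # handles whose write event has been activated
selector = selectors.DefaultSelector()

notify_sock, delivery_sock = socket.socketpair()  # bound together at initialization
notify_sock.setblocking(False)


def on_notify(sock: socket.socket) -> None:
    """Notification event: queue the notification, then arm the write event."""
    handle, length = NOTE.unpack(sock.recv(NOTE.size))
    write_queue.put((handle, length))
    armed_write_handles.add(handle)


selector.register(notify_sock, selectors.EVENT_READ, on_notify)  # activate the event


def request_write(handle: int, length: int) -> None:
    """Called by any thread that has data ready for a client."""
    delivery_sock.sendall(NOTE.pack(handle, length))


request_write(handle=7, length=1024)
for key, _mask in selector.select(timeout=1.0):
    key.data(key.fileobj)
```

  Sending through a socket rather than touching the queue directly lets worker threads hand work to the event loop without sharing its internal state.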
  • The file transfer method further includes a process in which the TCP server writes a data event. The process includes: when the accept-client event is invoked, the TCP server accepting the client and generating the client socket; initializing a write event for the client socket without activating it; when the TCP server needs to write data, sending the data to the delivery socket; the TCP server automatically invoking the notification event; reading the event notification from the notification socket, inserting it into the task queue that the write event consumes, and activating the write event; invoking the write event and writing the data to the client socket; and checking whether all the data in the cache has been written and, if not, activating the write event again, invoking it, and writing the remaining data to the client socket.
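  • The "write, check, re-activate" loop at the end of this flow can be sketched as below. The standard `selectors` module is an assumed stand-in for libevent, the `PendingWrite` class is hypothetical, and the 8-byte piece size is deliberately tiny so that the write event has to be re-armed several times.

```python
import selectors
import socket

selector = selectors.DefaultSelector()
CHUNK = 8                       # assumed per-event write size, tiny to force re-arming


class PendingWrite:
    def __init__(self, sock: socket.socket, data: bytes) -> None:
        self.sock = sock
        self.data = memoryview(data)
        self.calls = 0

    def activate(self) -> None:
        selector.register(self.sock, selectors.EVENT_WRITE, self)

    def on_write(self) -> None:
        """Write event: send one piece, then re-arm only if data is left."""
        self.calls += 1
        sent = self.sock.send(self.data[:CHUNK])
        self.data = self.data[sent:]
        selector.unregister(self.sock)  # one-shot, like a non-persistent event
        if len(self.data) > 0:
            self.activate()             # not finished: activate the write event again


server_side, client_side = socket.socketpair()
server_side.setblocking(False)
message = b"226 Transfer complete.\r\n"
pending = PendingWrite(server_side, message)
pending.activate()
while selector.get_map():               # loop until no write event is armed
    for key, _mask in selector.select(timeout=1.0):
        key.data.on_write()

received = b""
while len(received) < len(message):
    received += client_side.recv(1024)
```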
  • The file transfer method further includes a process in which the UDT server transfers a file. The process includes: the UDT server initializing the command-processing thread pool, the file read/write thread pool, and the data compression thread pool, creating a socket, and listening on it; when a client issues a request, assigning the task to an idle command thread; after connecting to the socket, the client sending a command, and the command thread parsing the command and notifying the command-processing thread; performing file reading and writing and data compression; and the command-processing thread exchanging data with the client.
  • The file transfer method further includes a process in which the UDT server compresses data. The process includes: recording and saving the size of the data before and after compression when compressing data.
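  • One way to record the sizes before and after compression is to prepend them to each compressed block, as sketched below. This is an assumption about how the sizes might be stored, not the patent's format. Python's standard `zlib` is used as a stand-in because LZO is not in the standard library, and the 8-byte header layout is hypothetical.

```python
import struct
import zlib

HEADER = struct.Struct("!II")   # assumed framing: original size, compressed size


def pack_block(raw: bytes) -> bytes:
    """Compress one block and prepend both sizes so the receiver can size buffers."""
    compressed = zlib.compress(raw)
    return HEADER.pack(len(raw), len(compressed)) + compressed


def unpack_block(frame: bytes) -> bytes:
    """Read the recorded sizes, decompress, and verify the original size."""
    original_size, compressed_size = HEADER.unpack_from(frame)
    body = frame[HEADER.size:HEADER.size + compressed_size]
    raw = zlib.decompress(body)
    if len(raw) != original_size:
        raise ValueError("size mismatch after decompression")
    return raw


text = b"ACGTACGTACGT" * 1000
frame = pack_block(text)
sizes = HEADER.unpack_from(frame)
```

  With both sizes known up front, the receiving side can allocate its single decompression workspace before touching the payload.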
  • The file transmission system and method provided by the present invention adopt the UDT protocol as the main network protocol and fall back to the TCP protocol as the underlying transport under network conditions where the UDT protocol cannot work. With reference to RFC 959, a compatible version of the FTP protocol is implemented as the application-layer protocol for data transmission. In addition, the present invention extends the protocol where appropriate and adds data compression, thereby achieving low latency and good reliability for long-distance transmission of large-scale data.
  • Figure 1 is a block diagram showing the structure of a high performance file transfer system in accordance with one embodiment of the present invention.
  • Figure 2 is a block diagram showing the structure of a high speed transmission TCP server in a high performance file transmission system according to an embodiment of the present invention.
  • Figure 3 is a block diagram showing the structure of a high speed transmission UDT server in a high performance file transfer system in accordance with one embodiment of the present invention.
  • Figure 4 is a block diagram showing the structure of a high performance file transfer system according to an embodiment of the present invention.
  • Figure 5 is a flow chart showing a high performance file transfer method in accordance with one embodiment of the present invention.
  • Figure 6 is a flow chart showing a high performance file transfer method in accordance with one embodiment of the present invention.
  • Figure 7 is a flow chart showing a high performance file transfer method in accordance with one embodiment of the present invention.
  • Figure 8 is a flow chart showing a high performance file transfer method in accordance with one embodiment of the present invention.
  • Figure 9 is a flow chart showing a high performance file transfer method in accordance with one embodiment of the present invention.
  • Figure 10 is a flow chart showing a high performance file transfer method in accordance with one embodiment of the present invention.
  • the present invention provides a high-performance file transmission system and method.
  • By means of a TCP protocol server and a UDT protocol server having corresponding functions, the performance of the network hardware is fully utilized and low latency is achieved for long-distance transmission of large volumes of data.
  • The system includes: a high-speed transmission TCP server, configured to implement an asynchronous non-blocking network mode at the socket layer by means of the libevent event-triggered network library so as to meet high-concurrency requirements; when a client issues a request, to encapsulate the request into an event notification, hash the client's control connection handle against the total number of session threads, and assign the event notification to the task queue of the corresponding session thread; to take event notifications from the task queue, call the finite state machine to process them and obtain the corresponding response code, and return it to the client through the socket layer; and, when a session request requires reading or writing data on disk, to encapsulate the corresponding information into an event notification and add it to the task queue of the disk processing thread, whereupon the disk processing thread reads or writes a large file between the disk and the cache in a single pass and the session thread then fetches data from the cache multiple times and interacts with the client multiple times through the socket layer;
  • and a high-speed transmission UDT server, configured to establish a main thread for listening and three thread pools for processing commands, reading and writing files, and compressing data; when there is a user request, the main thread assigns the task to an idle thread in the command-processing thread pool, and the idle thread creates a new thread to handle the data connection and transfer data; when files are read or written, the data thread cooperates with the file read/write thread and the data compression thread through synchronous communication; operations that do not read or write files are handled directly by the data connection thread.
  • The high-speed transmission TCP server further includes: a TCP interface module, configured to implement an asynchronous non-blocking network mode at the socket layer by means of the libevent event-triggered network library so as to meet high-concurrency requirements, and to hash the handle of the control connection and assign data to the task queues of different session threads; a session manager, configured to take data from the task queue, call the finite state machine to process it and obtain the corresponding response code, and return it to the client through the socket layer, with the session thread fetching data from the cache multiple times and interacting with the client multiple times through the socket layer; and a disk manager, configured to add the corresponding information to the task queue of the disk processing thread when a session request requires reading or writing data on disk, and to read or write a large file between the disk and the cache in a single pass.
  • The high-speed transmission UDT server further includes: a UDT interface module, configured to receive user requests, interact with the client, and transmit compressed data to the client; a session processing module, configured to establish a main thread for listening and three thread pools for processing commands, reading and writing files, and compressing data, wherein, when there is a user request, the main thread assigns the task to an idle thread in the command-processing thread pool, the idle thread creates a new thread to handle the data connection and transfer data, the data thread cooperates with the file read/write thread and the data compression thread through synchronous communication when files are read or written, and operations that do not read or write files are handled directly by the data connection thread; and a disk manager, configured to compress the data that needs to be transferred, providing two workspaces during compression for the data before compression and the compressed data, and only one workspace during decompression.
  • the high speed transfer TCP server employs a full asynchronous architecture mode.
  • the high speed transfer UDT server employs a synchronous blocking mode.
  • the text type file is compressed using an LZO compression algorithm.
  • The libevent event-triggered network library is used to set up a set of events for each control connection and each data connection; each set of four events implements communication with the client.
  • the four events of each group include: accepting client events, reading data events, notifying events, writing data events.
  • Another aspect of the present invention provides a high-performance file transfer method, implemented using any of the foregoing systems. The method includes a process in which the high-speed transmission TCP server accepts a client event. The process includes: initializing a listening socket and setting the listening mode; initializing an accept event for the listening socket, activating the accept event, and adding it to the libevent event-triggered network library; when a client initiates a connection request, the high-speed transmission TCP server automatically calling the accept-client event function; and accepting the client, generating the client socket, encapsulating the control connection handle together with the requested operation type into an event notification, and adding it to the task queue.
  • The method further includes a process in which the high-speed transmission TCP server reads a data event from the client. The process includes: when the accept-client event function is called, the high-speed transmission TCP server accepting the client and generating the client socket; initializing a read event for the client socket, activating the read event, and adding it to the libevent event-triggered network library; when the client sends data, the high-speed transmission TCP server automatically calling the read event function; and reading the data from the client socket, encapsulating it into an event notification, and adding it to the task queue.
  • The method further includes a process for the notification event of the high-speed transmission TCP server. The process includes: at initialization, the high-speed transmission TCP server binding a pair of sockets, a notification socket and a delivery socket; initializing a notification event for the notification socket and activating it; when the high-speed transmission TCP server needs to write data, the TCP manager sending the encapsulated event notification to the delivery socket; the high-speed transmission TCP server automatically calling the notification event function; and reading the data from the notification socket, inserting it into the task queue that the write event consumes, and activating the write event.
  • The method further includes a process in which the high-speed transmission TCP server writes a data event. The process includes: when the accept-client event function is called, the high-speed transmission TCP server accepting the client and generating the client socket; initializing a write event for the client socket without activating it; when the high-speed transmission TCP server needs to write data, the TCP manager sending the data to the delivery socket; the high-speed transmission TCP server automatically calling the notification event function; reading the data from the notification socket, inserting it into the task queue that the write event consumes, and activating the write event; calling the write event function and writing the data to the client socket; and checking whether all the data in the cache has been written and, if not, activating the write event again, calling the write event function, and writing the remaining data to the client socket.
  • The method further includes a process in which the high-speed transmission UDT server transfers a file. The process includes: the high-speed transmission UDT server initializing the thread pools, creating a socket, and listening on it; when a client initiates a connection request, assigning the task to an idle command thread; after connecting to the socket, the client sending a command, and the command thread parsing the command and notifying the data thread; performing file reading and writing and data compression; and the data thread exchanging data with the client.
  • The method further includes a process in which the high-speed transmission UDT server compresses data. The process includes: recording and saving the size of the data before and after compression when compressing the data.
  • The high-performance file transmission system and method provided by the invention adopt the UDT protocol as the main network protocol, fall back to the TCP protocol as the underlying transport under network conditions where the UDT protocol cannot work, and implement a compatible version of the FTP protocol with reference to RFC 959.
  • In addition, the present invention extends the protocol where appropriate and adds data compression, thereby achieving low delay and reliability for long-distance transmission of large-scale data.
  • FIG. 1 shows a block diagram of a high performance file transfer system in accordance with one embodiment of the present invention.
  • the high performance file transfer system 100 includes: a high speed transfer TCP server 102 and a high speed transfer UDT server 104;
  • The high-speed transmission TCP server 102 is configured to implement an asynchronous non-blocking network mode at the socket layer by means of the libevent event-triggered network library so as to meet high-concurrency requirements. When a client issues a request, the request is encapsulated into an event notification (which can include the control connection handle, the data connection handle, the requested operation type, the data cache pointer, the data length, and so on); the client's control connection handle (an int variable) is then reduced modulo the total number of session threads, and the event notification is assigned to the task queue of the corresponding session thread. The data is taken from the task queue, and the finite state machine is called to process it and obtain the corresponding response code, which is returned to the client through the socket layer. When a session request requires reading or writing data on disk, the corresponding information is added to the task queue of the disk processing thread; the disk processing thread reads or writes the large file between the disk and the cache in a single pass, after which the session thread fetches data from the cache multiple times and interacts with the client through the socket layer.
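  • The finite state machine that turns a request into a response code might look like the sketch below. The patent does not list its states or transitions, so the three states, the handful of commands, and the absence of real credential checks are all assumptions; the reply codes themselves (331, 230, 200, 221, 530, 503, 502) are those defined in RFC 959.

```python
from enum import Enum, auto


class State(Enum):
    NEED_USER = auto()
    NEED_PASS = auto()
    LOGGED_IN = auto()


# (state, command) -> (next state, RFC 959 reply code)
TRANSITIONS = {
    (State.NEED_USER, "USER"): (State.NEED_PASS, 331),  # user name okay, need password
    (State.NEED_PASS, "PASS"): (State.LOGGED_IN, 230),  # user logged in
    (State.LOGGED_IN, "NOOP"): (State.LOGGED_IN, 200),  # command okay
    (State.LOGGED_IN, "QUIT"): (State.NEED_USER, 221),  # closing control connection
}

# Reply used when a command is not valid in the current state.
REJECTIONS = {
    State.NEED_USER: 530,   # not logged in
    State.NEED_PASS: 503,   # bad sequence of commands
    State.LOGGED_IN: 502,   # command not implemented (in this sketch)
}


class Session:
    def __init__(self) -> None:
        self.state = State.NEED_USER

    def handle(self, command: str) -> int:
        """Process the command of one event notification and return the reply code."""
        key = (self.state, command.upper())
        if key not in TRANSITIONS:
            return REJECTIONS[self.state]
        self.state, code = TRANSITIONS[key]
        return code


session = Session()
codes = [session.handle(c) for c in ("NOOP", "USER", "PASS", "NOOP", "QUIT")]
```

  Keeping the transitions in a table means that one session thread can run the same `handle` function for every session assigned to it.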
  • Traditional FTP server I/O mainly consists of external communication I/O and internal I/O. External I/O is usually not the bottleneck; in many cases the internal I/O bottleneck is the key constraint. In response to this bottleneck, the fully asynchronous mode was chosen for the architecture of the FTP server. At the same time, the memory cache reduces the number of disk head seeks and read/write operations and improves the ratio of disk I/O utilization to network I/O utilization, which moves the bottleneck to network I/O.
  • The high-speed transmission UDT server 104 is configured to establish a main thread for listening and three thread pools for processing commands, reading and writing files, and compressing data. When there is a user request, the main thread assigns the task to an idle thread in the command-processing thread pool. The idle thread creates a new thread to handle the data connection and transfer data (in practice, this data thread operates a UDT socket to transfer the data). When a file is read or written, the data thread cooperates with the file read/write thread and the data compression thread through synchronous communication (two buffers can be set up between the three threads); operations that do not read or write files are handled directly by the command-processing thread.
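  • The cooperation of the three threads through two buffers can be sketched as a small pipeline. This is an illustration under several assumptions: bounded `queue.Queue` objects play the role of the two buffers, `zlib` stands in for LZO, an in-memory byte string stands in for the file, and collecting blocks in a list stands in for writing to the UDT socket.

```python
import queue
import threading
import zlib

CHUNK = 64 * 1024   # assumed block size
DONE = None         # sentinel that ends the stream


def reader(source: bytes, to_compress: queue.Queue) -> None:
    """File read/write thread: hand the file to the compressor block by block."""
    for offset in range(0, len(source), CHUNK):
        to_compress.put(source[offset:offset + CHUNK])
    to_compress.put(DONE)


def compressor(to_compress: queue.Queue, to_send: queue.Queue) -> None:
    """Data compression thread: compress each block independently."""
    while (block := to_compress.get()) is not DONE:
        to_send.put(zlib.compress(block))
    to_send.put(DONE)


def transfer(source: bytes) -> list:
    """Data thread: start the helpers and collect blocks as if sending them."""
    to_compress = queue.Queue(maxsize=1)  # buffer 1: reader -> compressor
    to_send = queue.Queue(maxsize=1)      # buffer 2: compressor -> data thread
    workers = [
        threading.Thread(target=reader, args=(source, to_compress)),
        threading.Thread(target=compressor, args=(to_compress, to_send)),
    ]
    for worker in workers:
        worker.start()
    sent = []
    while (block := to_send.get()) is not DONE:
        sent.append(block)                # a real server would write to the UDT socket
    for worker in workers:
        worker.join()
    return sent


payload = b"chr1\t12345\tA\tG\n" * 20000
blocks = transfer(payload)
restored = b"".join(zlib.decompress(block) for block in blocks)
```

  The blocking `put` and `get` calls on the one-slot queues provide the synchronous hand-off: a producer waits until the next stage has taken the previous block.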
  • The high-speed transmission TCP server employs a fully asynchronous architecture, which can initiate multiple non-blocking I/O operations in succession and notifies the application through a message or a thread callback function when an operation completes.
  • This is the most widely used model for high-performance I/O applications. In practice, only one or a few (customized by the number of CPUs) threads are needed to control multiple I/O operations.
  • The high-speed transmission UDT server uses synchronous blocking mode. Synchronous blocking means that a function does not return until it has finished executing or has received data, and the thread is suspended in the meantime.
  • The libevent event-triggered network library is used to set up a set of events for each control connection and each data connection; each set of four events implements control command communication and data transfer with the client.
  • 4 events in each group include: accept client events, read data events, notify events, write data events.
  • Text-type files are compressed using the LZO compression algorithm; compressing the data that needs to be transmitted accelerates the transfer.
  • During compression, two workspaces must be provided, one for the data before compression and one for the compressed data (the latter is larger, in case the data is incompressible); decompression needs only one workspace.
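  • Why the output workspace is made larger than the input can be seen in a short experiment. The sketch below uses `zlib` as a stand-in for LZO, and the margin in `output_workspace_size` is an assumed safety margin chosen for the example, not a bound documented by zlib or by the patent.

```python
import random
import zlib


def output_workspace_size(input_size: int) -> int:
    # Assumed safety margin for this sketch, not a bound promised by zlib or LZO.
    return input_size + input_size // 16 + 64


def compress_block(raw: bytes):
    """Compress using two workspaces; return (compressed bytes, output workspace size)."""
    input_ws = bytearray(raw)                               # workspace 1: before compression
    output_ws = bytearray(output_workspace_size(len(raw)))  # workspace 2: larger
    compressed = zlib.compress(bytes(input_ws))
    if len(compressed) > len(output_ws):
        raise BufferError("output workspace too small")
    output_ws[:len(compressed)] = compressed
    return bytes(output_ws[:len(compressed)]), len(output_ws)


rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(4096))  # effectively incompressible
text = b"GATTACA\n" * 512                               # highly repetitive, 4096 bytes

noise_out, noise_ws = compress_block(noise)
text_out, _ = compress_block(text)
```

  The repetitive text shrinks, while the random block comes out slightly larger than it went in, which is the case the larger second workspace is there to absorb.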
  • The high-performance file transmission system uses the lightweight libevent event-triggered network library to implement an asynchronous non-blocking network mode and meet high-concurrency requirements. In a multi-processor environment, multiple threads can each process the data of their own sessions independently, and the data of each session is always processed in one fixed thread; when multiple threads run at the same time, CPU utilization and performance improve, and ordering problems between the requests of a session are avoided.
  • the high speed transport TCP server 200 includes: a TCP interface module 202, a session manager 204, and a disk manager 206;
  • The TCP interface module 202 is configured to implement an asynchronous non-blocking network mode at the socket layer by means of the libevent event-triggered network library so as to meet high-concurrency requirements; when a client issues a request, the request is encapsulated into an event notification (which can include the control connection handle, the data connection handle, the requested operation type, the data cache pointer, the data length, and so on), the client's control connection handle (an int variable) is hashed against the total number of session threads, and the event notification is assigned to the task queue of the corresponding session thread.
  • the Event notifications can include: controlling connection handles, data connection handles, requested operation types, data cache pointers, and data lengths, etc., and then hashing the total number of session threads based on the client's control connection handle (an int variable) , assign event notifications to task queues of different session threads.
  • the session manager 204 is configured to take an event notification from the task queue, call the finite state machine to process it and obtain the corresponding response code, and return that code to the client through the socket layer; the session thread fetches data from the cache multiple times and interacts with the client multiple times through the socket layer.
  • the disk manager 206 is configured to, when a session request needs to read or write data on disk, encapsulate the corresponding information into a notification event, add it to the task queue of the disk processing thread, and read or write a large block of the file between disk and cache in a single operation.
  • the high-performance file transmission system uses libevent, a lightweight event-driven network library, to meet high-concurrency requirements; in a multi-processor environment, running several threads at once raises CPU utilization and performance while avoiding ordering problems among a session's requests. Further, when a session request needs to read or write data on disk, the corresponding information is added to the task queue of the disk processing thread. Because disks offer large capacity but slow read/write speed, the disk processing thread reads or writes a large block of the file between disk and cache in a single operation, and the session thread then fetches data from the cache multiple times and interacts with the client multiple times through the socket layer. This reduces the number of disk head seeks and read/write operations and raises the ratio of disk I/O utilization to network I/O utilization.
  • the high speed transmission UDT server 300 includes: a UDT interface module 302, a session processing module 304, and a disk manager 306;
  • the UDT interface module 302 is configured to receive a user request, and interact with the client and transmit the compressed data to the client.
  • the session processing module 304 establishes a main thread for listening and three thread pools for processing commands, reading and writing files, and compressing data; when a user request arrives, the main thread assigns the task to an idle thread in the command-processing thread pool, and that idle thread creates a new thread to handle the data connection and transfer data; when a file is read or written, the command-processing thread cooperates with the file read/write thread and the data-compression thread through synchronous communication; operations other than file reads and writes are handled directly by the command-processing thread.
  • the disk manager 306 is used for compressing the data to be transmitted; during compression it provides two working spaces, one for the uncompressed data and one for the compressed data, and during decompression it provides only one working space.
  • a text type file is compressed using an LZO (Lempel Ziv Oberhumer) compression algorithm.
  • the algorithm has a good compression ratio and compression rate for the text file, and with the characteristics of the UDT transmission protocol, the transmission efficiency can be improved.
  • the UDT protocol is mainly used when a small number of bulk sources share rich bandwidth.
  • the most typical example is grid computing built on a fiber-optic WAN.
  • the main goal of UDT is efficiency, fairness and stability.
  • a single or small number of UDT streams should utilize the available bandwidth offered by any high-speed connection, even if the bandwidth varies dramatically.
  • concurrent streams must share bandwidth fairly, regardless of differing bottleneck bandwidths, start times, or RTT (round-trip time). Stability requires that the packet sending rate always converge quickly to the available bandwidth and that congestion collapse be avoided.
  • the high-performance file transmission system provided by the invention adopts the UDT protocol as the main network protocol, extends the protocol appropriately, and adds data compression (the compression algorithm is LZO; it could later be specially optimized for the characteristics of massive data such as gene sequencing output); by exploiting the good flexibility of the UDT protocol server, it achieves high-speed transmission while guaranteeing reliability.
  • the high performance file transfer system 400 includes: a high speed transfer TCP server and a high speed transfer UDT server; wherein the high speed transfer TCP server includes: a TCP interface module 402, a session manager 404, and a disk manager 406;
  • the TCP interface module 402 is configured to implement an asynchronous non-blocking network mode by using a libevent event to trigger a network library at a socket layer to achieve high concurrency requirements; hashing according to a handle of the control connection, and allocating data to different session threads Task queue.
  • the session manager 404 is configured to retrieve data from the task queue, and then call the finite state machine to process and obtain a corresponding response code, which is returned to the client through the socket layer; the session thread calls the data from the cache multiple times, The socket layer interacts with the client multiple times.
  • the disk manager 406 is configured to, when a session request needs to read or write data on disk, add the corresponding information to the task queue of the disk processing thread and read or write a large block of the file between disk and cache in a single operation.
  • the high speed transmission UDT server includes: a UDT interface module 408, a session processing module 410, and a disk manager 412; wherein the UDT interface module 408 is configured to receive user requests, and interact with the client and transmit the compressed data to the client.
  • the session processing module 410 establishes a main thread for listening and three thread pools for processing commands, reading and writing files, and compressing data; when a user request arrives, the main thread assigns the task to an idle thread in the command-processing thread pool, and that idle thread creates a new thread to handle the data connection and transfer data; when a file is read or written, the command-processing thread cooperates with the file read/write thread and the data-compression thread through synchronous communication; operations other than file reads and writes are handled directly by the command-processing thread.
  • the disk manager 412 is used to compress the data to be transmitted; during compression it provides two working spaces, one for the uncompressed data and one for the compressed data, and during decompression it provides only one working space.
  • the high-performance file transmission system uses a fully asynchronous mode in the architecture of the FTP server and, by using a memory cache, reduces the number of disk head seeks and read/write operations, raising the ratio of disk I/O utilization to network I/O utilization and thereby shifting the bottleneck to network I/O.
  • the UDT protocol server is used to achieve high speed transmission and ensure the reliability of transmission.
  • Figure 5 shows a flow chart of a high performance file transfer method in accordance with one embodiment of the present invention.
  • the high-performance file transfer method is implemented by using a system of any of the foregoing embodiments; the method includes a process in which the high-speed transmission TCP server handles the accept-client event; the process 500 further includes:
  • Step 502 initializing a listening socket (such as setting the transport layer protocol and binding a port), and setting the listening mode.
  • Step 504 initializing the accept event for the listening socket, activating the accept event, and adding it to the libevent event-driven network library.
  • Step 506 when the client initiates a connection request, the high-speed transmission TCP server automatically invokes the accept-client event.
  • Step 508 accepting the client, generating a client socket, encapsulating the control connection handle and the requested operation type into an event notification, and adding it to the task queue.
  • FIG. 6 shows a flow chart of a high performance file transfer method in accordance with one embodiment of the present invention.
  • the high-performance file transfer method is implemented by using a system of any of the foregoing embodiments; the method includes a process in which the high-speed transmission TCP server handles the read-data event from a client; the process 600 further includes:
  • Step 602 when the accept-client event is invoked, the high-speed transmission TCP server accepts the client and generates a client socket.
  • Step 604 initializing a read event for the client socket, activating the read event, and adding it to the libevent event-driven network library.
  • Step 606 when the client sends data, the high-speed transmission TCP server automatically invokes the read event.
  • Step 608 reading data from the client socket (on a control connection the data is a string ending with "\r\n"; on a data connection it is uploaded file data), encapsulating it into an event notification (such as setting the control connection handle and data connection handle, filling in the requested operation type, and setting the data cache pointer and data length), and adding it to the task queue.
  • Figure 7 shows a flow chart of a high performance file transfer method in accordance with one embodiment of the present invention.
  • the high-performance file transfer method is implemented by using a system of any of the foregoing embodiments; the method includes a process in which the high-speed transmission TCP server handles the notification event; the process 700 further includes:
  • Step 702 at initialization, the high-speed transmission TCP server binds two sockets, the notification socket and the delivery socket, which are held in the TCP_manager (TCP manager) inside TCP_interface.
  • Step 704 initializing a notification event for the notification socket, and activating the notification event.
  • Step 706 when the high-speed transmission TCP server needs to write data, it sends the encapsulated event notification to the delivery socket through the TCP manager.
  • the TCP manager is a member of TCP_interface and its main function is to manage TCP sockets. Because the high-speed transmission TCP server binds two sockets at initialization, the notification socket and the delivery socket, the notification event has to be triggered through the TCP manager.
  • Step 708 the high-speed transmission TCP server automatically invokes the notification event.
  • Step 710 reading data from the notification socket, inserting it into the task queue that the write event can call, and activating the write event.
  • Figure 8 shows a flow chart of a high performance file transfer method in accordance with one embodiment of the present invention.
  • the high-performance file transfer method is implemented by using a system of any of the foregoing embodiments; the method includes a process in which the high-speed transmission TCP server handles the write-data event; the process 800 further includes:
  • Step 802 when the accept-client event is invoked, the high-speed transmission TCP server accepts the client and generates a client socket;
  • Step 804 initializing a write event for the client socket, but not activating the write event;
  • Step 806 when the high-speed transmission TCP server needs to write data, it sends the data to the delivery socket through the TCP manager;
  • Step 808 the high-speed transmission TCP server automatically invokes the notification event;
  • Step 810 reading data from the notification socket, inserting it into the task queue that the write event can call, and activating the write event;
  • Step 812 invoking the write event, and writing the data to the client socket;
  • Step 814 checking whether all the data in the cache has been written. If not, go to step 810: activate the write event again, invoke the write event, and write the data to the client socket.
  • Figure 9 shows a flow chart of a high performance file transfer method in accordance with one embodiment of the present invention.
  • the high-performance file transfer method is implemented by using a system of any of the foregoing embodiments; the method includes a process in which the high-speed transmission UDT server transfers a file; the process 900 further includes:
  • Step 902 the high-speed transmission UDT server initializes the thread pools, creates a socket, and listens on it.
  • Step 904 when the client initiates a connection request, assigning the task to an idle command thread.
  • Step 906 after the client connects to the socket, it sends a command, instructing the thread to parse the command and notify the data thread.
  • Step 908 performing file read and write and compressed data processing.
  • Step 910 The data thread performs data communication with the client.
  • Figure 10 shows a flow chart of a high performance file transfer method in accordance with one embodiment of the present invention.
  • the high-performance file transmission method is implemented by using a system of any one of the foregoing embodiments; the method includes a process in which the high-speed transmission UDT server compresses data; the process 1000 further includes: when compressing data, recording and saving the size of the data before and after compression. During decompression, the sizes before and after compression are extracted first; then the compressed data of the recorded size is read, decompressed, and the decompressed data is written back. This preserves the transmission rate while also ensuring the reliability of the decompressed data.
  • the high-performance file transfer system handles data transmission in the Panda project.
  • the transferred files range in size from a few KB to more than ten GB, and a dozen or more upload and download tasks run concurrently.
  • the following is the server log during the project: 2010-08-19 17:31:33 INFO - [run_finish_download] [1179] 172.30.0.29 download
  • a Gigabit link connects the server and the client in this environment, so the transfer rate is capped at 128 MB/s.
  • the average transmission rate reaches 110.80 MB/s and the average bandwidth utilization reaches 86.56%.
  • the high-performance file transmission system and method provided by the invention have an average bandwidth usage rate of the server exceeding 80%, and the data transmission is accurate.
  • An embodiment of the high-performance file transmission system and method provided by the present invention constructs a TCP protocol server and a UDT protocol server with corresponding functions, makes full use of the performance of the network hardware, and achieves low-latency long-distance transmission of large data volumes.
  • An embodiment of the high performance file transmission system and method provided by the present invention adopts the UDT protocol as the main network protocol; under network conditions where UDT cannot work, the TCP protocol is used as the underlying transport protocol, and an FTP-compatible application-layer protocol is implemented with reference to RFC 959.
  • An embodiment of the high performance file transfer system and method provided by the present invention uses libevent, a lightweight event-driven network library, to implement an asynchronous non-blocking network mode and meet high-concurrency requirements; in a multiprocessor environment, multiple threads can each process the data of their own sessions independently, and the data of each session is always handled by the same thread, so that running several threads at once raises CPU utilization and performance while avoiding ordering problems among a session's requests.
  • An embodiment of the high performance file transfer system and method provided by the present invention uses a fully asynchronous mode in the architecture of the FTP server and, by using a memory cache, reduces the number of disk head seeks and read/write operations, raising the ratio of disk I/O utilization to network I/O utilization and thereby shifting the bottleneck to network I/O.
  • the file transmission system and method of the present invention can be effectively applied to long-distance real-time transmission of large-scale data with high reliability.
  • the description of the present invention has been presented for purposes of illustration and description. Many modifications and variations will be apparent to those skilled in the art.
  • the functional modules described in the present invention, and the way they are divided, serve only to illustrate the invention. Those skilled in the art can freely change the division of the functional modules and their configuration to achieve the same result, according to the teachings of the present invention and the needs of practical applications.
  • the embodiments were chosen and described in order to explain the principles of the invention and the embodiments of the invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer And Data Communications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Description

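File Transfer System and Method
Priority Information
This application claims priority to and the benefit of Patent Application No. 201010551120.2, filed with the State Intellectual Property Office of China on November 19, 2010, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of network data transmission, and in particular to a file transfer system and method.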
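Background
At present, BGI (华大基因研究院) has world-class gene sequencing capacity and generates up to tens of TB (1 TB = 1024 GB) of data every day, which places high demands on data transfer. In addition, data synchronization between headquarters and the branch centers across China and overseas places very high demands on bandwidth utilization.
Although submarine optical cable technology has matured over more than 30 years of development, the signal delay introduced by long-distance network transmission remains unavoidable. In long-distance transfers in particular (an optical signal travelling from Beijing to New York incurs a delay of at least 60 ms), this delay severely degrades the performance of the traditional TCP protocol. Moreover, as the bandwidth-delay product (BDP) grows, TCP becomes inefficient: the AIMD algorithm sharply reduces the TCP congestion window but cannot quickly reclaim the available bandwidth, and theoretical flow analysis indicates that TCP becomes more vulnerable to packet loss as the BDP becomes very large. TCP therefore cannot carry out efficient transfers under these conditions.
As a result, traditional TCP-based file transfer systems and methods cannot meet the growing need for large-volume, long-distance, real-time transfers in terms of speed and reliability, and have become a bottleneck restricting the rapid development of network data transmission technology.
Therefore, current file transfer systems and methods still need improvement.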
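Summary of the Invention
The present invention aims to solve at least one of the technical problems in the prior art. To this end, it provides a file transfer system and method that, by constructing a TCP protocol server and a UDT protocol server with corresponding functions, makes full use of the network hardware and achieves low-latency long-distance transfer of large data volumes.
According to one aspect of the invention, a file transfer system is provided. According to embodiments of the invention, the file transfer system comprises a TCP server and a UDT server. The TCP server is configured to: implement an asynchronous non-blocking network mode at the socket layer through the libevent event-driven network library; when a request from a client does not require reading or writing data on disk, encapsulate the request into an event notification, hash it by taking the client's control connection handle modulo the total number of session threads, and assign the event notification to the task queue of the corresponding session thread; take event notifications from the session threads' task queues, call a finite state machine to process each notification and obtain the corresponding reply code, and return the reply code to the client through the socket layer; and, when a request from a client requires reading or writing data on disk, encapsulate the request into a notification event and add it to the task queue of the disk processing thread, which reads or writes a large block of the file between disk and cache in a single operation, after which the session thread fetches data from the cache multiple times and interacts with the client multiple times through the socket layer. The UDT server is configured to: create a main thread for listening, a command-processing thread pool, a file read/write thread pool, and a data-compression thread pool; when a client issues a request, have the main thread assign the task for that request to an idle thread in the command-processing thread pool, the idle thread handling the data connection to transfer data; when files are read or written, have the command-processing thread, the file read/write thread, and the data-compression thread cooperate through synchronous communication; and have the command-processing thread directly handle operations other than file reads and writes. According to embodiments of the invention, the file transfer system preferably uses a high-speed transmission TCP server and a high-speed transmission UDT server, so that the resulting system is a high-performance file transfer system. The inventors found that a file transfer system built from a high-speed transmission TCP server and a high-speed transmission UDT server can transfer large volumes of data over long distances effectively, at very high speed and with good reliability. The terms "high-speed transmission" and "high-performance" used above have no specific meaning and are not particularly limiting; a person skilled in the art will understand them to mean that, when the TCP server and the UDT server transfer data quickly, the file transfer system they form can transfer large volumes of data over long distances in real time, solving the earlier problems of high latency and poor reliability in such transfers. In that sense the file transfer system can be called high-performance.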
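According to one embodiment of the invention, the TCP server comprises: a TCP interface module, configured to receive requests from clients, to implement the asynchronous non-blocking network mode at the socket layer through the libevent event-driven network library, and to hash on the control connection handle so as to assign data to the task queues of different session threads; a session manager, configured to take data from the session threads' task queues, call the finite state machine to process the data and obtain the corresponding reply code, return the reply code to the client through the socket layer, and have the session thread fetch data from the cache multiple times and interact with the client multiple times through the socket layer; and a disk manager, configured to add a client request that requires reading or writing data on disk to the task queue of the disk processing thread, and to read or write a large block of the file between disk and cache in a single operation. The TCP server of the invention can thus be used effectively in the file transfer system.
According to some specific examples of the invention, the UDT server comprises: a UDT interface module, configured to receive requests from clients, to interact with the client, and to transfer compressed data to the client; a session processing module, configured to create a main thread for listening, a command-processing thread pool, a file read/write thread pool, and a data-compression thread pool, where, when a client issues a request, the main thread assigns the task for that request to an idle thread in the command-processing thread pool, the idle thread handles the data connection to transfer data, the command-processing thread, the file read/write thread, and the data-compression thread cooperate through synchronous communication when files are read or written, and the command-processing thread directly handles operations other than file reads and writes; and a disk manager, configured to compress the data to be transferred, providing two working spaces during compression, one for the uncompressed data and one for the compressed data, and one working space during decompression for the decompressed data. The UDT server of the invention can thus be used effectively in the file transfer system.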
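According to one embodiment of the invention, the TCP server uses a fully asynchronous architecture.
According to embodiments of the invention, the UDT server uses a synchronous blocking mode.
According to some specific examples of the invention, the data of text-type files is compressed with the LZO compression algorithm.
According to some embodiments of the invention, at the socket layer, one group of events is set up through the libevent event-driven network library for the control connection and one for the data connection, each group containing 4 events, to implement control-command communication and data transfer with the client. The 4 events in each group are: the accept-client event, the read-data event, the notification event, and the write-data event.
The inventors were surprised to find that the file transfer system according to embodiments of the invention transfers large-scale data quickly and effectively, maintains real-time transfer speed and reliability even over long distances, and is markedly superior to current file transfer systems.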
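According to another aspect, the invention also provides a file transfer method implemented with the file transfer system according to embodiments of the invention. According to some specific examples, the method includes a flow in which the TCP server handles the accept-client event, comprising: initializing a listening socket, setting the listening mode, and setting the socket to non-blocking network mode; initializing an accept event for the listening socket, activating the accept event, and adding it to the libevent event-driven network library; when a client issues a request, the TCP server automatically invoking the accept-client event; and accepting the client, generating a client socket, encapsulating the control connection handle and the requested operation type into an event notification, and adding the event notification to the task queue.
According to embodiments of the invention, the method further includes a flow in which the TCP server handles the read-data event from a client, comprising: when the accept-client event is invoked, the TCP server accepting the client and generating a client socket; initializing a read event for the client socket, activating the read event, and adding it to the libevent event-driven network library; when the client sends data, the TCP server automatically invoking the read event; and reading the data from the client socket, encapsulating it into an event notification, and adding it to the task queue.
According to embodiments of the invention, the method further includes a flow in which the TCP server handles the notification event, comprising: at initialization, the TCP server binding two sockets, a notification socket and a delivery socket; initializing a notification event for the notification socket and activating it; when the TCP server needs to write data, sending the notification event to the delivery socket; the TCP server automatically invoking the notification event; and reading the notification event from the notification socket, inserting it into the task queue that the write event can call, and activating the write event.
According to embodiments of the invention, the method further includes a flow in which the TCP server handles the write-data event, comprising: when the accept-client event is invoked, the TCP server accepting the client and generating a client socket; initializing a write event for the client socket without activating it; when the TCP server needs to write data, sending the data to the delivery socket; the TCP server automatically invoking the notification event; reading the notification event from the notification socket, inserting it into the task queue that the write event can call, and activating the write event; invoking the write event and writing the data to the client socket; and checking whether all the data in the cache has been written and, if not, activating the write event again, invoking it, and writing the data to the client socket.
According to embodiments of the invention, the method further includes a flow in which the UDT server transfers a file, comprising: the UDT server initializing the command-processing thread pool, the file read/write thread pool, and the data-compression thread pool, creating a socket, and listening on it; when a client issues a request, assigning the task to an idle command thread; the client sending a command after connecting to the socket, instructing the thread to parse the command and notify the command-processing thread; performing file reads and writes and data compression; and the command-processing thread exchanging data with the client.
According to embodiments of the invention, the method further includes a flow in which the UDT server compresses data, comprising: recording and saving the sizes of the data before and after compression when compressing.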
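The inventors found that the file transfer method according to the embodiments above can transfer large-scale data over long distances in real time, with low latency and good reliability, and is easy to deploy widely. Specifically, according to embodiments of the invention, the file transfer system and method use the UDT protocol as the main network protocol; under network conditions in which UDT cannot work, TCP is used as the underlying transport protocol and, with reference to RFC 959, a compatible version of the FTP protocol is implemented as the application-layer protocol for data transfer. In addition, the invention extends the protocol appropriately and adds data compression, thereby achieving low latency and excellent reliability for long-distance transfer of large-scale data.
Additional aspects and advantages of the invention will be set out in part in the description below, and in part will become apparent from that description or be learned through practice of the invention.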
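Brief Description of the Drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of embodiments taken together with the drawings, in which:
Figure 1: a schematic structural diagram of a high-performance file transfer system according to one embodiment of the invention;
Figure 2: a schematic structural diagram of the high-speed transmission TCP server in a high-performance file transfer system according to one embodiment of the invention;
Figure 3: a schematic structural diagram of the high-speed transmission UDT server in a high-performance file transfer system according to one embodiment of the invention;
Figure 4: a schematic structural diagram of a high-performance file transfer system according to one embodiment of the invention;
Figure 5: a flowchart of a high-performance file transfer method according to one embodiment of the invention;
Figure 6: a flowchart of a high-performance file transfer method according to one embodiment of the invention;
Figure 7: a flowchart of a high-performance file transfer method according to one embodiment of the invention;
Figure 8: a flowchart of a high-performance file transfer method according to one embodiment of the invention;
Figure 9: a flowchart of a high-performance file transfer method according to one embodiment of the invention; and
Figure 10: a flowchart of a high-performance file transfer method according to one embodiment of the invention.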
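Detailed Description of the Invention
Embodiments of the invention are described in detail below, with examples shown in the drawings, where the same or similar reference numerals throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the invention, and are not to be construed as limiting it.
To solve at least one of the problems of the prior art, the invention provides a high-performance file transfer system and method that, by constructing a TCP protocol server and a UDT protocol server with corresponding functions, makes full use of the network hardware and achieves low-latency long-distance transfer of large data volumes.
One aspect of the invention provides a high-performance file transfer system. In one embodiment, the system comprises a high-speed transmission TCP server and a high-speed transmission UDT server. The high-speed transmission TCP server implements an asynchronous non-blocking network mode at the socket layer through the libevent event-driven network library to meet high-concurrency requirements; when a client issues a request, it encapsulates the request into an event notification, hashes it by taking the client's control connection handle modulo the total number of session threads, and assigns the event notification to the task queue of the corresponding session thread; it takes event notifications from the task queue, calls the finite state machine to process them and obtain the corresponding reply code, and returns the code to the client through the socket layer; when a session request needs to read or write data on disk, it encapsulates the corresponding information into a notification event and adds it to the task queue of the disk processing thread, which reads or writes a large block of the file between disk and cache in a single operation, after which the session thread fetches data from the cache multiple times and interacts with the client multiple times through the socket layer. The high-speed transmission UDT server creates a main thread for listening and three thread pools for processing commands, reading and writing files, and compressing data; when a user request arrives, the main thread assigns the task to an idle thread in the command-processing thread pool, and the idle thread creates a new thread to handle the data connection and transfer data; when files are read or written, the data-processing thread cooperates with the file read/write thread and the data-compression thread through synchronous communication; operations other than file reads and writes are handled directly by the data connection thread.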
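In one embodiment of the high-performance file transfer system provided by the invention, the high-speed transmission TCP server further comprises: a TCP interface module, configured to implement an asynchronous non-blocking network mode at the socket layer through the libevent event-driven network library to meet high-concurrency requirements, and to hash on the control connection handle so as to assign data to the task queues of different session threads; a session manager, configured to take data from the task queue, call the finite state machine to process it and obtain the corresponding reply code, and return the code to the client through the socket layer, with the session thread fetching data from the cache multiple times and interacting with the client multiple times through the socket layer; and a disk manager, configured to add the corresponding information to the task queue of the disk processing thread when a session request needs to read or write data on disk, and to read or write a large block of the file between disk and cache in a single operation.
In one embodiment of the high-performance file transfer system provided by the invention, the high-speed transmission UDT server further comprises: a UDT interface module, configured to receive user requests, interact with the client, and transfer compressed data to the client; a session processing module, which creates a main thread for listening and three thread pools for processing commands, reading and writing files, and compressing data, where, when a user request arrives, the main thread assigns the task to an idle thread in the command-processing thread pool, the idle thread creates a new thread to handle the data connection and transfer data, the data thread cooperates with the file read/write thread and the data-compression thread through synchronous communication when files are read or written, and operations other than file reads and writes are handled directly by the data connection thread; and a disk manager, configured to compress the data to be transferred, providing two working spaces during compression, one for the uncompressed data and one for the compressed data, and only one working space during decompression.
In one embodiment of the high-performance file transfer system provided by the invention, the high-speed transmission TCP server uses a fully asynchronous architecture.
In one embodiment of the high-performance file transfer system provided by the invention, the high-speed transmission UDT server uses a synchronous blocking mode.
In one embodiment of the high-performance file transfer system provided by the invention, text-type files are compressed with the LZO compression algorithm.
In one embodiment of the high-performance file transfer system provided by the invention, at the socket layer, one group of events is set up through the libevent event-driven network library for the control connection and one for the data connection, each group containing 4 events, to implement communication with the client.
In one embodiment of the high-performance file transfer system provided by the invention, the 4 events in each group are: the accept-client event, the read-data event, the notification event, and the write-data event.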
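Another aspect of the invention provides a high-performance file transfer method implemented with any one of the systems described above. The method includes a flow in which the high-speed transmission TCP server handles the accept-client event, comprising: initializing a listening socket and setting the listening mode; initializing an accept event for the listening socket, activating the accept event, and adding it to the libevent event-driven network library; when a client initiates a connection request, the high-speed transmission TCP server automatically invoking the accept-client event function; and accepting the client, generating a client socket, encapsulating the control connection handle and the requested operation type into an event notification, and adding it to the task queue.
In one embodiment of the high-performance file transfer method provided by the invention, the method further includes a flow in which the high-speed transmission TCP server handles the read-data event from a client, comprising: when the accept-client event function is invoked, the high-speed transmission TCP server accepting the client and generating a client socket; initializing a read event for the client socket, activating the read event, and adding it to the libevent event-driven network library; when the client sends data, the high-speed transmission TCP server automatically invoking the read event function; and reading the data from the client socket, encapsulating it into an event notification, and adding it to the task queue.
In one embodiment of the high-performance file transfer method provided by the invention, the method further includes a flow in which the high-speed transmission TCP server handles the notification event, comprising: at initialization, the high-speed transmission TCP server binding two sockets, a notification socket and a delivery socket; initializing a notification event for the notification socket and activating it; when the high-speed transmission TCP server needs to write data, sending the encapsulated event notification to the delivery socket through the TCP manager; the high-speed transmission TCP server automatically invoking the notification event function; and reading the data from the notification socket, inserting it into the task queue that the write event can call, and activating the write event.
In one embodiment of the high-performance file transfer method provided by the invention, the method further includes a flow in which the high-speed transmission TCP server handles the write-data event, comprising: when the accept-client event function is invoked, the high-speed transmission TCP server accepting the client and generating a client socket; initializing a write event for the client socket without activating the write event; when the high-speed transmission TCP server needs to write data, sending the data to the delivery socket through the TCP manager; the high-speed transmission TCP server automatically invoking the notification event function; reading the data from the notification socket, inserting it into the task queue that the write event can call, and activating the write event; invoking the write event function and writing the data to the client socket; and checking whether all the data in the cache has been written and, if not, activating the write event again, invoking the write event function, and writing the data to the client socket.
In one embodiment of the high-performance file transfer method provided by the invention, the method further includes a flow in which the high-speed transmission UDT server transfers a file, comprising: the high-speed transmission UDT server initializing the thread pools, creating a socket, and listening on it; when a client initiates a connection request, assigning the task to an idle command thread; the client sending a command after connecting to the socket, instructing the thread to parse the command and notify the data thread; performing file reads and writes and data compression; and the data thread exchanging data with the client.
In one embodiment of the high-performance file transfer method provided by the invention, the method further includes a flow in which the high-speed transmission UDT server compresses data, comprising: recording and saving the sizes of the data before and after compression when compressing.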
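The high-performance file transfer system and method provided by the invention use the UDT protocol as the main network protocol; under network conditions in which UDT cannot work, TCP is used as the underlying transport protocol and, with reference to RFC 959, a compatible version of the FTP protocol is implemented as the application-layer protocol for data transfer. In addition, the invention extends the protocol appropriately and adds data compression, thereby achieving low latency and reliability for long-distance transfer of large-scale data.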
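The finite state machine that turns an FTP-style control command into a reply code can be illustrated with a minimal Python sketch. The patent does not disclose its state machine, so the states, the `FtpSession` class, and the in-memory account table are assumptions made for illustration; the numeric reply codes are the standard ones defined in RFC 959.

```python
from enum import Enum, auto
from typing import Dict, Optional


class State(Enum):
    NEED_USER = auto()
    NEED_PASS = auto()
    LOGGED_IN = auto()
    CLOSED = auto()


class FtpSession:
    """Hypothetical per-session state machine mapping commands to reply codes."""

    def __init__(self, accounts: Dict[str, str]) -> None:
        self.accounts = accounts          # illustrative user -> password table
        self.state = State.NEED_USER
        self.user: Optional[str] = None

    def handle(self, line: str) -> str:
        verb, _, arg = line.strip().partition(" ")
        verb = verb.upper()
        if verb == "QUIT":
            self.state = State.CLOSED
            return "221 Goodbye."
        if verb == "USER":
            self.user, self.state = arg, State.NEED_PASS
            return "331 User name okay, need password."
        if verb == "PASS":
            if self.state is not State.NEED_PASS:
                return "503 Bad sequence of commands."
            if self.accounts.get(self.user) == arg:
                self.state = State.LOGGED_IN
                return "230 User logged in."
            self.state = State.NEED_USER
            return "530 Not logged in."
        if self.state is not State.LOGGED_IN:
            # Every other command requires a completed login.
            return "530 Not logged in."
        if verb == "NOOP":
            return "200 Command okay."
        return "502 Command not implemented."
```

A session thread would call `handle` once per control line taken from its task queue and send the returned string, followed by CRLF, back over the control connection.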
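It should be noted that the file transfer system and method according to embodiments of the invention were completed by the inventors only after arduous creative work and optimization.
Embodiments of the invention are described in detail below with reference to examples. Those skilled in the art will understand that the following examples serve only to illustrate the invention and should not be regarded as limiting its scope.
Figure 1 shows a schematic structural diagram of a high-performance file transfer system according to one embodiment of the invention.
As shown in Figure 1, the high-performance file transfer system 100 comprises a high-speed transmission TCP server 102 and a high-speed transmission UDT server 104, where: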
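The high-speed transmission TCP server 102 implements an asynchronous non-blocking network mode at the socket layer through the libevent event-driven network library to meet high-concurrency requirements. When a client issues a request, the server encapsulates the request into an event notification (which can include the control connection handle, the data connection handle, the requested operation type, the data cache pointer, the data length, and so on), hashes it by taking the client's control connection handle (an int variable) modulo the total number of session threads, and assigns the event notification to the task queue of the corresponding session thread. Data is taken from the task queue, the finite state machine (Finite State Machine) is called to process it and obtain the corresponding reply code, and the code is returned to the client through the socket layer. When a session request needs to read or write data on disk, the corresponding information is added to the task queue of the disk processing thread, which reads or writes a large block of the file between disk and cache in a single operation; the session thread then fetches data from the cache multiple times and interacts with the client multiple times through the socket layer.
The I/O of a traditional FTP server consists mainly of external communication I/O and internal I/O. External I/O is usually not the bottleneck, because in many cases the server's internal I/O is the key constraint. To address this bottleneck, the FTP server architecture here uses a fully asynchronous mode and, by using a memory cache, reduces the number of disk head seeks and read/write operations, raising the ratio of disk I/O utilization to network I/O utilization and thereby shifting the bottleneck to network I/O.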
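The high-speed transmission UDT server 104 creates a main thread for listening and three thread pools for processing commands, reading and writing files, and compressing data. When a user request arrives, the main thread assigns the task to an idle thread in the command-processing thread pool, and the idle thread creates a new thread to handle the data connection and transfer data (in practice, the data thread drives a UDT socket to transfer the data). When files are read or written, the data thread cooperates with the file read/write thread and the data-compression thread through synchronous communication (two buffers can be set up between the three threads). Operations other than file reads and writes are handled directly by the command-processing thread.
According to one embodiment of the invention, the high-speed transmission TCP server uses a fully asynchronous architecture (asynchronous). This mode can issue multiple non-blocking I/O operations in succession and notifies the application through a message or a thread callback function when an operation completes. It is the most widely used model in high-performance I/O applications; in practice, one or a few threads (chosen according to the number of CPUs) suffice to control many I/O operations. The high-speed transmission UDT server uses a synchronous blocking mode, in which a function does not return until it has finished executing or finished receiving data, and the calling thread is suspended in the meantime.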
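According to one embodiment of the invention, in the high-performance file transfer system, one group of events is set up at the socket layer through the libevent event-driven network library for the control connection and one for the data connection, each group containing 4 events, to implement control-command communication and data transfer with the client. The 4 events in each group are: the accept-client event, the read-data event, the notification event, and the write-data event.
According to one embodiment of the invention, in the high-performance file transfer system, text-type files are compressed with the LZO compression algorithm. Compressing the data to be transferred speeds up the transfer. Compression requires two working spaces, one for the uncompressed data and one for the compressed data (the latter is larger, to cover the case where the data cannot be compressed), while decompression needs only one working space.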
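The reason the compressed-data work area must be the larger of the two can be made concrete with a small Python sketch. It assumes the worst-case expansion bound documented for the LZO1X family (input length plus one sixteenth of it, plus 67 bytes); the patent does not state which bound it uses, and the function names and block size are illustrative.

```python
from typing import Tuple


def lzo1x_worst_case_size(input_len: int) -> int:
    """Upper bound on LZO1X output size for input_len bytes of incompressible input.

    Assumption: the bound given in the LZO documentation,
    input_len + input_len / 16 + 64 + 3.
    """
    return input_len + input_len // 16 + 64 + 3


def allocate_work_buffers(block_size: int) -> Tuple[bytearray, bytearray]:
    """Allocate the two compression-side work areas described in the text."""
    raw = bytearray(block_size)                            # data before compression
    packed = bytearray(lzo1x_worst_case_size(block_size))  # data after compression
    return raw, packed


raw, packed = allocate_work_buffers(256 * 1024)
print(len(raw), len(packed))
```

Decompression only needs the output area, sized from the recorded uncompressed length, which is consistent with the single work area mentioned for that direction.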
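The high-performance file transfer system provided by the invention uses libevent, a lightweight event-driven network library, to implement an asynchronous non-blocking network mode and meet high-concurrency requirements. In a multi-processor environment, multiple threads can each process the data of their own sessions independently, and the data of each session is always handled by the same thread. Running several threads at once therefore raises CPU utilization and performance while avoiding ordering problems among a session's requests.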
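Pinning each session to one thread follows directly from hashing the control connection handle modulo the number of session threads. The Python sketch below shows that dispatch rule; the `EventNotification` fields mirror the ones listed in the description, while the class names and the use of `queue.Queue` are assumptions for illustration.

```python
from dataclasses import dataclass
from queue import Queue
from typing import List, Optional


@dataclass
class EventNotification:
    control_fd: int             # control connection handle (an int)
    data_fd: Optional[int]      # data connection handle, if one is open
    op_type: str                # requested operation type
    buffer: bytes = b""         # stands in for the data cache pointer

    @property
    def length(self) -> int:    # data length
        return len(self.buffer)


class Dispatcher:
    """Assigns notifications to per-thread task queues by control_fd % n_threads."""

    def __init__(self, n_session_threads: int) -> None:
        self.queues: List[Queue] = [Queue() for _ in range(n_session_threads)]

    def queue_index(self, control_fd: int) -> int:
        return control_fd % len(self.queues)

    def dispatch(self, note: EventNotification) -> int:
        index = self.queue_index(note.control_fd)
        self.queues[index].put(note)
        return index
```

Because the handle of a control connection does not change during a session, every notification for that session lands in the same queue, so one thread sees the session's requests in arrival order without any locking between session threads.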
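Figure 2 shows a schematic structural diagram of the high-speed transmission TCP server in a high-performance file transfer system according to one embodiment of the invention.
As shown in Figure 2, the high-speed transmission TCP server 200 comprises a TCP interface module (TCP Interface) 202, a session manager (Session Manager) 204, and a disk manager (Disk Manager) 206, where:
The TCP interface module 202 implements an asynchronous non-blocking network mode at the socket layer through the libevent event-driven network library to meet high-concurrency requirements. When a client issues a request, the module encapsulates the request into an event notification (which can include the control connection handle, the data connection handle, the requested operation type, the data cache pointer, the data length, and so on), hashes it by taking the client's control connection handle (an int variable) modulo the total number of session threads, and assigns the event notification to the task queue of the corresponding session thread.
The session manager 204 takes event notifications from the task queue, calls the finite state machine to process them and obtain the corresponding reply code, and returns the code to the client through the socket layer; the session thread fetches data from the cache multiple times and interacts with the client multiple times through the socket layer.
The disk manager 206, when a session request needs to read or write data on disk, encapsulates the corresponding information into a notification event, adds it to the task queue of the disk processing thread, and reads or writes a large block of the file between disk and cache in a single operation.
The high-performance file transfer system provided by the invention uses libevent, a lightweight event-driven network library, to meet high-concurrency requirements; in a multi-processor environment, running several threads at once raises CPU utilization and performance while avoiding ordering problems among a session's requests. Further, when a session request needs to read or write data on disk, the corresponding information is added to the task queue of the disk processing thread. Because disks offer large capacity but slow read/write speed, the disk processing thread reads or writes a large block of the file between disk and cache in a single operation, and the session thread then fetches data from the cache multiple times and interacts with the client multiple times through the socket layer. This reduces the number of disk head seeks and read/write operations and raises the ratio of disk I/O utilization to network I/O utilization.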
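The pattern of one large disk read followed by many small network sends can be sketched in a few lines of Python. This is an illustration under assumptions, not the patented implementation: the block and chunk sizes, the function names, and the `send` callback (standing in for a write to the client socket) are all hypothetical.

```python
import io
from typing import BinaryIO, Callable, Iterator

DISK_BLOCK = 4 * 1024 * 1024   # hypothetical size of one large disk read
NET_CHUNK = 64 * 1024          # hypothetical size of one network send


def chunks_from_cache(block: bytes, chunk_size: int) -> Iterator[memoryview]:
    """Yield network-sized slices of a cached block without copying it."""
    view = memoryview(block)
    for offset in range(0, len(view), chunk_size):
        yield view[offset:offset + chunk_size]


def serve_file(
    f: BinaryIO,
    send: Callable[[bytes], None],
    disk_block: int = DISK_BLOCK,
    net_chunk: int = NET_CHUNK,
) -> int:
    """Stream f to send(); returns how many disk reads were needed."""
    disk_reads = 0
    while True:
        block = f.read(disk_block)      # disk thread: one large sequential read
        if not block:
            break
        disk_reads += 1
        for chunk in chunks_from_cache(block, net_chunk):
            send(bytes(chunk))          # session thread: many sends from cache
    return disk_reads
```

With these example sizes, each disk read feeds 64 network sends, which is the ratio the text aims to improve; in the described system the two loops run in different threads connected by a task queue rather than in one function.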
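Figure 3 shows a schematic structural diagram of the high-speed transmission UDT server in a high-performance file transfer system according to one embodiment of the invention.
As shown in Figure 3, the high-speed transmission UDT server 300 comprises a UDT interface module (UDT Interface) 302, a session processing module (Session Processing) 304, and a disk manager (Disk Manager) 306, where:
The UDT interface module 302 receives user requests, interacts with the client, and transfers compressed data to the client.
The session processing module 304 creates a main thread for listening and three thread pools for processing commands, reading and writing files, and compressing data. When a user request arrives, the main thread assigns the task to an idle thread in the command-processing thread pool, and the idle thread creates a new thread to handle the data connection and transfer data. When files are read or written, the command-processing thread cooperates with the file read/write thread and the data-compression thread through synchronous communication. Operations other than file reads and writes are handled directly by the command-processing thread.
The disk manager 306 compresses the data to be transferred; during compression it provides two working spaces, one for the uncompressed data and one for the compressed data, and during decompression it provides only one working space.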
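According to one embodiment of the invention, in the high-performance file transfer system, text-type files are compressed with the LZO (Lempel Ziv Oberhumer) compression algorithm. The algorithm offers a good compression ratio and compression speed for text files and, combined with the characteristics of the UDT transport protocol, improves transfer efficiency.
The UDT protocol is mainly intended for the case where a small number of bulk sources share abundant bandwidth; the most typical example is grid computing over a fiber-optic wide-area network. UDT's main goals are efficiency, fairness, and stability. A single UDT flow, or a small number of them, should use all the available bandwidth offered by any high-speed link, even when that bandwidth varies sharply. At the same time, concurrent flows must share bandwidth fairly, regardless of differing bottleneck bandwidths, start times, or RTT (round-trip time). Stability requires that the packet sending rate always converge quickly to the available bandwidth and that congestion collapse be avoided.
The high-performance file transfer system provided by the invention uses the UDT protocol as the main network protocol, extends the protocol appropriately, and adds data compression (the compression algorithm is LZO; it could later be specially optimized for the characteristics of massive data such as gene sequencing output). By exploiting the good flexibility of the UDT protocol server, it achieves high-rate transfer while guaranteeing reliability.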
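Figure 4 shows a schematic structural diagram of one specific implementation of a high-performance file transfer system according to one embodiment of the invention.
As shown in Figure 4, the high-performance file transfer system 400 comprises a high-speed transmission TCP server and a high-speed transmission UDT server.
The high-speed transmission TCP server comprises a TCP interface module 402, a session manager 404, and a disk manager 406, where:
The TCP interface module 402 implements an asynchronous non-blocking network mode at the socket layer through the libevent event-driven network library to meet high-concurrency requirements, and hashes on the control connection handle so as to assign data to the task queues of different session threads.
The session manager 404 takes data from the task queue, calls the finite state machine to process it and obtain the corresponding reply code, and returns the code to the client through the socket layer; the session thread fetches data from the cache multiple times and interacts with the client multiple times through the socket layer.
The disk manager 406, when a session request needs to read or write data on disk, adds the corresponding information to the task queue of the disk processing thread and reads or writes a large block of the file between disk and cache in a single operation.
The high-speed transmission UDT server comprises a UDT interface module 408, a session processing module 410, and a disk manager 412, where:
The UDT interface module 408 receives user requests, interacts with the client, and transfers compressed data to the client.
The session processing module 410 creates a main thread for listening and three thread pools for processing commands, reading and writing files, and compressing data. When a user request arrives, the main thread assigns the task to an idle thread in the command-processing thread pool, and the idle thread creates a new thread to handle the data connection and transfer data. When files are read or written, the command-processing thread cooperates with the file read/write thread and the data-compression thread through synchronous communication. Operations other than file reads and writes are handled directly by the command-processing thread.
The disk manager 412 compresses the data to be transferred; during compression it provides two working spaces, one for the uncompressed data and one for the compressed data, and during decompression it provides only one working space.
The high-performance file transfer system provided by the invention uses a fully asynchronous mode in the architecture of the FTP server and, by using a memory cache, reduces the number of disk head seeks and read/write operations, raising the ratio of disk I/O utilization to network I/O utilization and thereby shifting the bottleneck to network I/O. In addition, by exploiting the good flexibility of the UDT protocol server, it achieves high-rate transfer while guaranteeing reliability.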
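Figure 5 shows a flowchart of a high-performance file transfer method according to one embodiment of the invention.
As shown in Figure 5, the high-performance file transfer method is implemented with any of the systems of the foregoing embodiments. The method includes a flow in which the high-speed transmission TCP server handles the accept-client event; flow 500 comprises:
Step 502: initialize the listening socket (for example, set the transport layer protocol and bind a port) and set the listening mode.
Step 504: initialize an accept event for the listening socket, activate the accept event, and add it to the libevent event-driven network library.
Step 506: when a client initiates a connection request, the high-speed transmission TCP server automatically invokes the accept-client event.
Step 508: accept the client, generate a client socket, encapsulate the control connection handle and the requested operation type into an event notification, and add it to the task queue.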
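Steps 502 to 508 can be sketched with Python's standard `selectors` module, used here as a stand-in for libevent, which the patent actually uses and which is a C library. The `AcceptLoop` class, the dictionary used as an event notification, and the `"CONNECT"` operation label are assumptions for illustration.

```python
import selectors
import socket
from collections import deque


class AcceptLoop:
    """Accept-client flow (steps 502-508) on top of Python's selectors module."""

    def __init__(self, host: str = "127.0.0.1", port: int = 0) -> None:
        self.selector = selectors.DefaultSelector()
        self.task_queue = deque()  # stands in for the session task queue

        # Step 502: create the listening socket, bind a port, start listening.
        self.listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.listener.bind((host, port))
        self.listener.listen()
        self.listener.setblocking(False)

        # Step 504: register ("activate") the accept event with the event loop.
        self.selector.register(self.listener, selectors.EVENT_READ, self.on_accept)

    def on_accept(self, listener: socket.socket) -> None:
        # Step 508: accept the client and queue an event notification.
        client, _addr = listener.accept()
        client.setblocking(False)
        self.task_queue.append(
            {"control_fd": client.fileno(), "op": "CONNECT", "sock": client}
        )

    def run_once(self, timeout: float = 1.0) -> None:
        # Step 506: the loop invokes the callback when a client connects.
        for key, _mask in self.selector.select(timeout):
            key.data(key.fileobj)

    def close(self) -> None:
        for note in self.task_queue:
            note["sock"].close()
        self.selector.close()
        self.listener.close()
```

Binding to port 0 lets the operating system pick a free port, which keeps the sketch runnable anywhere; a real server would bind its configured control port and run `run_once` in a loop.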
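Figure 6 shows a flowchart of a high-performance file transfer method according to one embodiment of the invention.
As shown in Figure 6, the high-performance file transfer method is implemented with any of the systems of the foregoing embodiments. The method includes a flow in which the high-speed transmission TCP server handles the read-data event from a client; flow 600 comprises:
Step 602: when the accept-client event is invoked, the high-speed transmission TCP server accepts the client and generates a client socket.
Step 604: initialize a read event for the client socket, activate the read event, and add it to the libevent event-driven network library.
Step 606: when the client sends data, the high-speed transmission TCP server automatically invokes the read event.
Step 608: read the data from the client socket (on a control connection the data is a string ending with "\r\n"; on a data connection it is uploaded file data), encapsulate it into an event notification (for example, set the control connection handle and data connection handle, fill in the requested operation type, and set the data cache pointer and data length), and add it to the task queue.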
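On a control connection, step 608 has to turn an arbitrary stream of received bytes into complete command lines terminated by CRLF, because a single read may deliver half a command or several at once. A minimal Python sketch of that framing follows; the `ControlLineBuffer` class is hypothetical and not taken from the patent.

```python
from typing import List


class ControlLineBuffer:
    """Accumulates bytes from a control connection and yields complete CRLF lines."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> List[str]:
        """Add newly read bytes; return every complete line now available."""
        self._buf += data
        lines: List[str] = []
        while True:
            end = self._buf.find(b"\r\n")
            if end < 0:
                break                     # incomplete line stays buffered
            lines.append(self._buf[:end].decode("ascii", errors="replace"))
            del self._buf[:end + 2]       # drop the line and its CRLF
        return lines
```

Each returned line would then be wrapped in an event notification and queued; bytes that do not yet form a full line remain in the buffer until the next read event.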
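Figure 7 shows a flowchart of a high-performance file transfer method according to one embodiment of the present invention.
As shown in Figure 7, the high-performance file transfer method is implemented using the system of any of the foregoing embodiments. The method includes a flow in which the high-speed transfer TCP server handles the notification event; this flow 700 further includes:
Step 702: at initialization, the high-speed transfer TCP server binds two sockets, a notification socket and a passing socket, which are held in the TCP_manager (TCP manager) inside TCP_interface.
Step 704: initialize a notification event for the notification socket and activate the notification event.
Step 706: when the high-speed transfer TCP server needs to write data, it sends the encapsulated event notification to the passing socket through the TCP manager. Specifically, the TCP manager is a member inside TCP_interface whose main function is to manage the TCP sockets. Because the high-speed transfer TCP server binds the notification socket and the passing socket at initialization, executing a notification event requires a call through the TCP manager.
Step 708: the high-speed transfer TCP server automatically invokes the notification event.
Step 710: read the data from the notification socket, insert it into the task queue that the write event can use, and activate the write event.
Figure 8 shows a flowchart of a high-performance file transfer method according to one embodiment of the present invention.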
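As shown in Figure 8, the high-performance file transfer method is implemented using the system of any of the foregoing embodiments. The method includes a flow in which the high-speed transfer TCP server handles the write-data event; this flow 800 further includes:
Step 802: when the accept-client event is invoked, the high-speed transfer TCP server accepts the client and generates a client socket.
Step 804: initialize a write event for the client socket, but do not activate the write event.
Step 806: when the high-speed transfer TCP server needs to write data, it sends the data to the passing socket through the TCP manager.
Step 808: the high-speed transfer TCP server automatically invokes the notification event.
Step 810: read the data from the notification socket, insert it into the task queue that the write event can use, and activate the write event.
Step 812: invoke the write event and write the data to the client socket.
Step 814: check whether all the data in the cache has been written. If not, jump back to step 810: activate the write event again, invoke it, and write the data to the client socket.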
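The notification and write flows (steps 702 to 710 and 802 to 814) can be sketched together as below. The standard `selectors` module again stands in for libevent, and `socket.socketpair()` stands in both for the bound notification/passing sockets and for the client connection. The names `on_notify`, `on_write`, `write_queue`, and `state` are hypothetical, and the reply string is only sample data.

```python
import collections
import selectors
import socket

selector = selectors.DefaultSelector()
write_queue = collections.deque()
state = {"write_active": False}

# Step 702: the bound pair. Workers write to pass_sock; the loop reads notify_sock.
notify_sock, pass_sock = socket.socketpair()
# Stand-in for an accepted client connection (step 802) and its remote end.
client_sock, remote_peer = socket.socketpair()
client_sock.setblocking(False)
remote_peer.settimeout(5)

def on_write(sock):
    """Steps 812-814: write what fits; stay active until the buffer is empty."""
    sent = sock.send(write_queue[0])
    write_queue[0] = write_queue[0][sent:]
    if not write_queue[0]:
        write_queue.popleft()
    if not write_queue:
        selector.unregister(sock)          # nothing left: deactivate the write event
        state["write_active"] = False

def on_notify(sock):
    """Steps 808-810: queue the data and activate the write event."""
    write_queue.append(memoryview(sock.recv(65536)))
    if not state["write_active"]:
        selector.register(client_sock, selectors.EVENT_WRITE, on_write)
        state["write_active"] = True

# Step 704: the notification event is active from the start; the write event is not.
selector.register(notify_sock, selectors.EVENT_READ, on_notify)

reply = b"226 Transfer complete\r\n"
pass_sock.sendall(reply)                   # step 806: request a write

for _ in range(2):   # first pass: notification event; second pass: write event
    for key, _mask in selector.select(timeout=5):
        key.data(key.fileobj)

received = remote_peer.recv(4096)

for s in (notify_sock, pass_sock, client_sock, remote_peer):
    s.close()
selector.close()
```

Leaving the write event inactive until data is queued avoids a loop that spins on an always-writable socket; that is presumably why step 804 initializes the event without activating it.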
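Figure 9 shows a flowchart of a high-performance file transfer method according to one embodiment of the present invention.
As shown in Figure 9, the high-performance file transfer method is implemented using the system of any of the foregoing embodiments. The method includes a flow in which the high-speed transfer UDT server transfers a file; this flow 900 further includes:
Step 902: the high-speed transfer UDT server initializes the thread pools, creates a socket, and listens on it.
Step 904: when a client initiates a connection request, the task is assigned to an idle command thread.
Step 906: after connecting to the socket, the client sends a command; the command thread is instructed to parse the command and notify the data thread.
Step 908: perform the file read/write and data compression processing.
Step 910: the data thread exchanges data with the client.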
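The cooperation between the three thread pools can be sketched as follows. Several things here are assumptions for illustration: `concurrent.futures.ThreadPoolExecutor` stands in for the patent's thread pools, `zlib` replaces LZO (which is not in the Python standard library), an in-memory dictionary replaces the disk, and the pool sizes and the names `handle_command` and `read_file` are hypothetical. Network transport is omitted.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Step 902: one pool each for commands, file I/O and compression.
command_pool = ThreadPoolExecutor(max_workers=2)
file_pool = ThreadPoolExecutor(max_workers=2)
compress_pool = ThreadPoolExecutor(max_workers=2)

FILES = {"reads.fq": b"ACGT" * 1000}   # hypothetical in-memory "disk"

def read_file(name):
    return FILES[name]

def handle_command(command):
    """Command thread: parse the command, then wait on the other pools."""
    verb, _, arg = command.partition(" ")
    if verb != "RETR":
        return b"200 OK"               # non-file command: handled directly
    # .result() blocks, so the hand-off between threads is synchronous.
    raw = file_pool.submit(read_file, arg).result()
    return compress_pool.submit(zlib.compress, raw).result()

# Step 904: the listener hands each request to an idle command thread.
noop_reply = command_pool.submit(handle_command, "NOOP").result()
payload = command_pool.submit(handle_command, "RETR reads.fq").result()

for pool in (command_pool, file_pool, compress_pool):
    pool.shutdown()
```

Using separate pools means a command thread that blocks on file or compression work never occupies a worker that the work itself needs.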
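Figure 10 shows a flowchart of a high-performance file transfer method according to one embodiment of the present invention.
As shown in Figure 10, the high-performance file transfer method is implemented using the system of any of the foregoing embodiments. The method includes a flow in which the high-speed transfer UDT server compresses data; this flow 1000 further includes: when compressing data, recording and saving the size of the data before and after compression. Later, during decompression, the sizes before and after compression are extracted first; compressed data of the corresponding size is then read and decompressed, and the decompressed data is written back. This preserves the transfer rate and also ensures the reliability of the decompressed data.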
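Recording both sizes alongside each compressed block might be done with a small header, as in the sketch below. The header layout (two 32-bit big-endian integers) is an assumption, since the text does not specify one, and `zlib` is used in place of LZO because LZO is not in the Python standard library. `pack_block` and `unpack_block` are hypothetical names.

```python
import struct
import zlib

HEADER = struct.Struct("!II")   # hypothetical layout: original size, compressed size

def pack_block(raw):
    """Compress one block and prefix it with both sizes."""
    compressed = zlib.compress(raw)
    return HEADER.pack(len(raw), len(compressed)) + compressed

def unpack_block(buffer, offset=0):
    """Read the sizes first, then exactly that much compressed data."""
    original_size, compressed_size = HEADER.unpack_from(buffer, offset)
    start = offset + HEADER.size
    raw = zlib.decompress(buffer[start:start + compressed_size])
    if len(raw) != original_size:
        raise ValueError("decompressed size does not match header")
    return raw, start + compressed_size

stream = pack_block(b"ACGT" * 500) + pack_block(b"NNNN" * 10)
first, offset = unpack_block(stream)
second, end = unpack_block(stream, offset)
```

The compressed size tells the receiver where one block ends in a continuous stream, and the original size gives a check on the decompressed result.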
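A specific example of applying the high-performance file transfer system and method provided by the present invention is briefly described next.
At present, the high-performance file transfer system (Hyper Transfer) handles data transfer in the Panda project. The transferred files range in size from a few KB to more than ten GB, and more than ten upload and download tasks run concurrently. The following is the server log from the operation of that project:
2010-08-19 17:31:33 INFO - [run_finish_download] [1179] 172.30.0.29 download /mnt/soft/soap/input/100000_reads_2.fq: 108.108MB/s
2010-08-19 17:31:37 INFO - [run_finish_download] [1179] 172.30.0.29 download /mnt/soft/soap/input/cattle_cuted.fa: 112.674MB/s
2010-08-19 17:31:40 INFO - [run_finish_download] [1179] 172.30.0.29 download /mnt/soft/soap/input/100000_reads_2.fq: 110.129MB/s
2010-08-19 17:31:44 INFO - [run_finish_download] [1179] 172.30.0.29 download /mnt/soft/soap/input/cattle_cuted.fa: 112.282MB/s
In this environment the server and client are connected by gigabit bandwidth, with a transfer upper limit of 128 MB/s. During the transfer the average rate reached 110.80 MB/s, an average bandwidth utilization of 86.56%. With the high-performance file transfer system and method provided by the present invention, the server's average bandwidth utilization exceeds 80% and the data is transferred without error.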
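The stated averages can be reproduced from the four log lines shown, as the short calculation below illustrates. It assumes the average was taken over exactly those four transfers and uses the 128 MB/s limit given in the text.

```python
# Transfer rates from the four log lines, in MB/s.
rates_mb_s = [108.108, 112.674, 110.129, 112.282]
link_limit_mb_s = 128.0   # upper limit stated in the text

average = sum(rates_mb_s) / len(rates_mb_s)
utilization = average / link_limit_mb_s * 100

print(f"average rate: {average:.2f} MB/s")
print(f"utilization: {utilization:.2f}%")
```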
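From the foregoing exemplary description of the present invention, those skilled in the art can clearly see that the present invention has the following advantages:
1. One embodiment of the high-performance file transfer system and method provided by the present invention constructs a TCP protocol server and a UDT protocol server with the corresponding functions, makes full use of the performance of the network hardware, and achieves low latency when transferring large volumes of data over long distances.
2. One embodiment of the high-performance file transfer system and method provided by the present invention uses UDT as the main network protocol. Under network conditions in which UDT cannot work, it uses TCP as the underlying transport protocol and, with reference to RFC 959, implements a compatible version of the FTP protocol as the application-layer protocol for data transfer. In addition, the present invention extends the protocol appropriately and adds data compression, thereby achieving low latency and reliability for large-scale long-distance data transfer.
3. One embodiment of the high-performance file transfer system and method provided by the present invention implements an asynchronous non-blocking network mode through libevent, a lightweight event-driven network library, so as to meet the requirement of high concurrency. In a multiprocessor environment, multiple threads can independently process the data of their respective sessions, and the data of each session is always processed by the same thread. When multiple threads run at the same time, this effectively raises CPU utilization and performance while avoiding ordering problems among session requests.
4. One embodiment of the high-performance file transfer system and method provided by the present invention adopts a fully asynchronous mode in its FTP server architecture and uses an in-memory cache to reduce the number of disk head seeks and disk reads and writes, raising the ratio of disk I/O utilization to network I/O utilization and thereby shifting the bottleneck to network I/O.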
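Industrial Applicability
The file transfer system and method of the present invention can be effectively applied to long-distance real-time transfer of large-scale data, with good reliability.
The description of the present invention is given for the sake of illustration and description, and is not exhaustive or intended to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The functional modules described in the present invention and the way they are divided serve only to illustrate the idea of the invention; those skilled in the art may, according to the teaching of the invention and the needs of the actual application, freely change the division of the functional modules and their construction to achieve the same functions. The embodiments were chosen and described in order to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and so design various embodiments with various modifications suited to particular uses.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "illustrative embodiment", "example", "specific example", or "some examples" means that the specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.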

Claims

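1. A file transfer system, characterized in that the file transfer system comprises a TCP server and a UDT server, wherein the TCP server is configured to:
implement an asynchronous non-blocking network mode at the socket layer through the libevent event-driven network library;
when a request issued by a client does not need to read or write data on disk, encapsulate the request into an event notification, then hash by directly taking the remainder of the client's control connection handle with respect to the total number of session threads, and assign the event notification to the task queues of different session threads;
take the event notification out of the task queues of the different session threads, then invoke a finite state machine to process the event notification and obtain a corresponding reply code, and return the reply code to the client through the socket layer; and
when a request issued by a client needs to read or write data on disk, encapsulate the request into a notification event and add it to the task queue of a disk processing thread, the disk processing thread reading or writing a large block of the file between disk and cache in a single operation, after which the session thread fetches data from the cache multiple times and interacts with the client multiple times through the socket layer; and
the UDT server is configured to:
establish a main thread for listening, a command-processing thread pool, a file read/write thread pool, and a data-compression thread pool;
when a client issues a request, have the main thread assign the task for the request to an idle thread in the command-processing thread pool, the idle thread being used to handle the data connection so as to transfer data;
when a file is read or written, perform synchronous communication and cooperation among the command-processing thread, the file read/write thread, and the data-compression thread; and
have operations other than file reads and writes handled directly by the command-processing thread.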
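2. The system according to claim 1, characterized in that the TCP server comprises:
a TCP interface module, the TCP interface module being configured to receive requests issued by clients, to implement at the socket layer an asynchronous non-blocking network mode through the libevent event-driven network library, and to hash on the handle of the control connection so that data is assigned to the task queues of different session threads;
a session manager, the session manager being configured to take data out of the task queues of the different session threads, then invoke a finite state machine to process the data and obtain a corresponding reply code, return the reply code to the client through the socket layer, and have the session thread fetch data from the cache multiple times and interact with the client multiple times through the socket layer; and
a disk manager, the disk manager being configured, when a request issued by a client needs to read or write data on disk, to add the request to the task queue of a disk processing thread and to read or write a large block of the file between disk and cache in a single operation.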
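3. The system according to claim 1, characterized in that the UDT server comprises:
a UDT interface module, the UDT interface module being configured to receive requests issued by clients, to interact with the client, and to transfer compressed data to the client;
a session processing module, the session processing module being configured to:
establish a main thread for listening, a command-processing thread pool, a file read/write thread pool, and a data-compression thread pool,
when a client issues a request, have the main thread assign the task for the request to an idle thread in the command-processing thread pool, the idle thread being used to handle the data connection so as to transfer data,
when a file is read or written, perform synchronous communication and cooperation among the command-processing thread, the file read/write thread, and the data-compression thread, and
have operations other than file reads and writes handled directly by the command-processing thread; and
a disk manager, the disk manager being configured to:
compress the data to be transferred,
during compression, provide two workspaces for holding the data before compression and the data after compression, and
during decompression, provide one workspace for holding the decompressed data.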
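4. The system according to claim 1, characterized in that the TCP server adopts a fully asynchronous architecture mode.
5. The system according to claim 1, characterized in that the UDT server adopts a synchronous blocking mode.
6. The system according to claim 1 or 3, characterized in that the LZO compression algorithm is used to compress the data of text-type files.
7. The system according to claim 1 or 2, characterized in that, at the socket layer, one group of events is set up through the libevent event-driven network library for each of the control connection and the data connection, each group having 4 events, for implementing control command communication and data transfer with the client.
8. The system according to claim 7, characterized in that the 4 events of each group comprise: an accept-client event, a read-data event, a notification event, and a write-data event.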
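9. A file transfer method, characterized in that it is implemented using the system according to any one of the preceding claims, the method comprising a flow in which the TCP server handles the accept-client event, wherein
the flow further comprises:
initializing a listening socket, setting listening mode, and setting the socket to non-blocking network mode;
initializing an accept event for the listening socket, activating the accept event, and adding the accept event to the libevent event-driven network library;
when a client issues a request, the TCP server automatically invoking the accept-client event; and
accepting the client, generating a client socket, encapsulating the control connection handle and the requested operation type into an event notification, and adding the event notification to a task queue.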
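10. The method according to claim 9, characterized in that the method further comprises a flow in which the TCP server handles the event of reading data from the client, wherein
the flow further comprises:
when the accept-client event is invoked, the TCP server accepting the client and generating a client socket;
initializing a read event for the client socket, activating the read event, and adding the read event to the libevent event-driven network library;
when the client sends data, the TCP server automatically invoking the read event; and
reading data through the client socket, encapsulating the data into an event notification, and adding it to a task queue.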
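11. The method according to claim 9, characterized in that the method further comprises a flow in which the TCP server handles the notification event, wherein
the flow further comprises:
at initialization, the TCP server binding two sockets: a notification socket and a passing socket;
initializing a notification event for the notification socket and activating the notification event;
when the TCP server needs to write data, sending a notification event to the passing socket;
the TCP server automatically invoking the notification event; and
reading the notification event from the notification socket, inserting the notification event into the task queue that the write event can use, and activating the write event.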
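12. The method according to claim 9, characterized in that the method further comprises a flow in which the TCP server handles the write-data event, wherein
the flow further comprises:
when the accept-client event is invoked, the TCP server accepting the client and generating a client socket;
initializing a write event for the client socket, but not activating the write event;
when the TCP server needs to write data, sending the data to the passing socket;
the TCP server automatically invoking the notification event;
reading the notification event from the notification socket, inserting the notification event into the task queue that the write event can use, and activating the write event;
invoking the write event and writing the data to the client socket; and
checking whether all the data in the cache has been written and, if not, activating the write event again, invoking the write event, and writing the data to the client socket.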
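13. The method according to claim 9, characterized in that the method further comprises a flow in which the UDT server transfers a file, wherein
the flow further comprises:
the UDT server initializing the command-processing thread pool, the file read/write thread pool, and the data-compression thread pool, creating a socket, and listening on the socket;
when a client issues a request, assigning the task to an idle command thread;
after connecting to the socket, the client sending a command, a thread being instructed to parse the command and notify the command-processing thread;
performing file read/write and data compression processing; and
the command-processing thread exchanging data with the client.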
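14. The method according to claim 9, characterized in that the method further comprises a flow in which the UDT server compresses data, wherein
the flow further comprises: when compressing data, recording and saving the size of the data before and after compression.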
PCT/CN2011/081949 2010-11-19 2011-11-08 File transfer system and method WO2012065520A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010551120.2 2010-11-19
CN 201010551120 CN101982955B (zh) 2010-11-19 2010-11-19 高性能文件传输系统及方法

Publications (1)

Publication Number Publication Date
WO2012065520A1 true WO2012065520A1 (zh) 2012-05-24

Family

ID=43619847

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/081949 WO2012065520A1 (zh) 2010-11-19 2011-11-08 文件传输系统及方法

Country Status (2)

Country Link
CN (1) CN101982955B (zh)
WO (1) WO2012065520A1 (zh)

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101982955B (zh) * 2010-11-19 2013-09-04 深圳华大基因科技有限公司 高性能文件传输系统及方法
CN103078883B (zh) * 2011-10-26 2016-06-15 新奥特(北京)视频技术有限公司 基于ftp的异步式文件传输装置
CN103312625B (zh) 2012-03-09 2016-02-03 深圳市腾讯计算机系统有限公司 一种网络通信的方法和系统
CN103095723A (zh) * 2013-02-04 2013-05-08 中国科学院信息工程研究所 一种网络安全监控方法及系统
CN103501245B (zh) * 2013-09-26 2017-02-08 北京搜狐互联网信息服务有限公司 一种网络事件处理方法及装置
CN105491088A (zh) * 2014-09-17 2016-04-13 中兴通讯股份有限公司 一种文件传输方法及装置
CN104243212A (zh) * 2014-09-24 2014-12-24 杭州华三通信技术有限公司 会话维护方法及装置
CN104753956B (zh) * 2015-04-13 2020-06-16 网神信息技术(北京)股份有限公司 一种数据处理方法和装置
CN105429718A (zh) * 2015-10-28 2016-03-23 西安电子科技大学 基于多并发的无线频谱监测方法
CN105407150A (zh) * 2015-10-31 2016-03-16 苏浩强 应用程序远程控制方法
CN107656779A (zh) * 2016-07-25 2018-02-02 武汉票据交易中心有限公司 一种基于事件的流程处理方法及相关系统
CN106775447B (zh) * 2016-11-14 2020-03-27 成都广达新网科技股份有限公司 一种基于异步非阻塞的磁盘文件读写速率控制方法
CN106850740B (zh) * 2016-12-19 2019-07-23 中国科学院信息工程研究所 一种高吞吐数据流处理方法
CN107124461A (zh) * 2017-05-04 2017-09-01 北京奇艺世纪科技有限公司 一种数据传输方法、装置及系统
CN107329838A (zh) * 2017-05-23 2017-11-07 努比亚技术有限公司 一种业务交互方法、终端和计算机可读存储介质
CN107634984B (zh) * 2017-08-07 2020-11-24 国网河南省电力公司 一种基于单向传输通道的文件同步方法
WO2019056203A1 (zh) * 2017-09-20 2019-03-28 深圳市海能通信股份有限公司 一种低延时音视频传输方法、装置及计算机可读存储介质
CN108055255A (zh) * 2017-12-07 2018-05-18 华东师范大学 一种事件库、可扩展数据管理系统及其管理方法
CN108881378A (zh) * 2018-05-02 2018-11-23 象翌微链科技发展有限公司 一种文件的传输方法、系统及设备
CN108647104B (zh) * 2018-05-15 2022-05-31 北京五八信息技术有限公司 请求处理方法、服务器及计算机可读存储介质
CN108848105A (zh) * 2018-06-29 2018-11-20 首钢京唐钢铁联合有限责任公司 建立通讯连接的方法及装置
CN109274643A (zh) * 2018-08-14 2019-01-25 国网甘肃省电力公司电力科学研究院 一种基于libevent架构的新能源厂站发电单元终端接入管理系统
CN109587235A (zh) * 2018-11-30 2019-04-05 深圳市网心科技有限公司 一种基于网络库的数据访问方法、客户端、系统及介质
CN109918209B (zh) * 2019-01-28 2021-02-02 深兰科技(上海)有限公司 一种线程间通信的方法和设备
CN109660562A (zh) * 2019-01-30 2019-04-19 苏州德锐特成像技术有限公司 一种用于大数据同步的系统及客户端
CN111698275B (zh) * 2019-03-15 2021-12-14 华为技术有限公司 数据处理方法、装置及设备
CN110519329B (zh) * 2019-07-23 2022-06-07 苏州浪潮智能科技有限公司 一种并发处理samba协议请求的方法、设备及可读介质
CN112134852B (zh) * 2020-08-31 2021-08-13 广州锦行网络科技有限公司 一种蜜罐系统攻击行为数据异步http发送方法及装置
CN111949607B (zh) * 2020-09-03 2023-06-27 网易(杭州)网络有限公司 一种udt文件的监控方法、系统和装置
CN112069146A (zh) * 2020-09-08 2020-12-11 北京同有飞骥科技股份有限公司 提高基于zfs文件系统异步远程复制的方法及系统
CN112732657A (zh) * 2020-12-30 2021-04-30 广州金越软件技术有限公司 一种在ftp服务场景下高效读取大量小文件的方法
CN112995198B (zh) * 2021-03-29 2023-04-28 中信银行股份有限公司 一种基于socket的短连接通信方法和装置
CN113992651B (zh) * 2021-09-24 2024-05-14 深圳市有方科技股份有限公司 一种基于文件传输协议ftp的下载方法和相关产品
CN114338647B (zh) * 2021-12-16 2024-06-14 中孚安全技术有限公司 一种基于国产操作系统的轻量级文件传输方法及系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1519743A (zh) * 2003-01-23 2004-08-11 英业达股份有限公司 利用广播机制均衡负载的文件预装系统及其方法
US20090216880A1 (en) * 2008-02-26 2009-08-27 Viasat, Inc. Methods and Systems for Dynamic Transport Selection Based on Last Mile Network Detection
CN101894066A (zh) * 2010-04-28 2010-11-24 北京同有飞骥科技有限公司 一种基于磁盘阵列虚拟化的网络存储管理软件测试方法
CN101982955A (zh) * 2010-11-19 2011-03-02 深圳华大基因科技有限公司 高性能文件传输系统及方法

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108984285A (zh) * 2018-06-28 2018-12-11 上海数据交易中心有限公司 一种数据碰撞流分析方法及装置、存储介质、终端
CN108984285B (zh) * 2018-06-28 2021-10-15 上海数据交易中心有限公司 一种数据碰撞流分析方法及装置、存储介质、终端
CN110851246A (zh) * 2019-09-30 2020-02-28 天阳宏业科技股份有限公司 一种批量任务处理方法、装置、系统及存储介质
CN111488324A (zh) * 2020-04-14 2020-08-04 浪潮商用机器有限公司 一种基于消息中间件的分布式网络文件系统及其工作方法
CN111488324B (zh) * 2020-04-14 2023-12-29 浪潮商用机器有限公司 一种基于消息中间件的分布式网络文件系统及其工作方法

Also Published As

Publication number Publication date
CN101982955A (zh) 2011-03-02
CN101982955B (zh) 2013-09-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11841415

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11841415

Country of ref document: EP

Kind code of ref document: A1