WO2016165482A1 - File transfer method and apparatus - Google Patents

File transfer method and apparatus

Info

Publication number
WO2016165482A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
path
transfer
transmission
address
Prior art date
Application number
PCT/CN2016/074278
Other languages
French (fr)
Chinese (zh)
Inventor
吴孝鹏
尤元建
黄增建
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2016165482A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]

Definitions

  • This document relates to, but is not limited to, the field of data storage technology, and more particularly to a file transmission method and apparatus.
  • The Internet of Things, developed on the basis of the Internet, has been widely used in information exchange and data transmission.
  • To store massive amounts of data, the Hadoop system, a distributed system infrastructure developed by the Apache Foundation, is generally used.
  • The Hadoop system includes a distributed file system (Hadoop Distributed File System, HDFS) and a distributed database (Hbase).
  • Data in the Hadoop system is stored in HDFS as files, which may be structured or unstructured data files; data in unstructured column mode can be queried through Hbase. The Hadoop system is a platform for massive data storage, and users can upload or download files from a terminal device using the command line.
  • For example, the PUT command uploads files stored in the terminal device to the HDFS of the Hadoop system, and the GET command downloads files stored in the HDFS of the Hadoop system to the memory of the terminal device or to a relational database at the network end.
  • Each execution of the command line supports the transmission of only a single file, and the data format of each transmitted file is fixed. Transferring multiple files, or files of different data formats, requires manually entering the command line multiple times.
  • Because each command line supports only a single file with a fixed data format, the file transfer operation is complicated and inefficient.
  • Embodiments of the present invention provide a file transmission method and apparatus that avoid the complicated operation and low efficiency of file transfer with the Hadoop system.
  • an embodiment of the present invention provides a file transmission method, including:
  • The transmission rule includes a file format, and searching the source address of the file transmission path for a file that meets the transmission rule and transferring the found file to the target address includes:
  • The file transmission path includes a first path and a second path, each of which includes at least one file transmission path; searching the source address of the file transmission path for files matching the file format, generating a file transfer list, and sequentially transferring the corresponding files in the source address to the target address according to the file transfer list includes:
  • The configuration file further includes a verification mode, and the transmission rule includes an index verification file; searching the source address of the file transmission path for a file that meets the transmission rule and transferring the found file to the target address includes:
  • when verification is on, searching the source address of the file transmission path for the files corresponding to the index entries of the index verification file, and transferring the found files to the target address;
  • The method further includes:
  • storing the index verification file in a preset path of the network end to which the target address belongs;
  • The transmission rule further includes a file format, and searching the source address of the file transmission path for a file that meets the transmission rule and transferring the found file to the target address includes:
  • when verification is off, searching the source address of the file transmission path for files matching the file format, generating a file transfer list, and sequentially transferring the corresponding files in the source address to the target address according to the file transfer list.
  • The configuration file further includes a compression mode, and transferring the found file to the target address includes:
  • transferring the found file to the target address when the compression mode is off.
  • The configuration file further includes a backup mode and a backup address; after the found file is transferred to the target address, the method further includes:
  • backing up the file in the source address of the file transmission path that has been transferred to the target address to the backup address of the network end to which the source address belongs.
  • The configuration file further includes a scanning time interval; after searching the source address of the file transmission path for the file that meets the transmission rule and transferring the found file to the target address, the method further includes:
  • an embodiment of the present invention provides a file transmission apparatus, including:
  • a configuration module configured to generate a configuration file according to a user input, where the configuration file includes a file transmission path and a transmission rule corresponding to the file transmission path, and the file transmission path includes a source address and a target address of the file to be transmitted;
  • a transmission module configured to search, according to the configuration file generated by the configuration module, the source address of the file transmission path for files that meet the transmission rule, and to transmit the found files to the target address, where the transferred files include formatted data files and/or unformatted data files.
  • The transmission rule includes a file format.
  • The transmission module includes: a searching unit configured to search the source address of the file transmission path for files matching the file format and to generate a file transfer list;
  • a transmitting unit configured to sequentially transmit the corresponding file in the source address to the target address according to the file transfer list generated by the searching unit.
  • The file transmission path includes a first path and a second path, each of which includes at least one file transmission path; the searching unit is configured to sequentially search the source addresses of the first path and the second path for files matching the file format corresponding to the current file transmission path, and to generate a current file transfer list;
  • the transmitting unit is configured to sequentially transmit the corresponding file in the source address of the current file transmission path to the target address according to the current file transmission list generated by the searching unit.
  • the configuration file further includes a verification mode
  • the transmission rule includes an index verification file
  • The transmission module includes: a determining unit configured to determine the state of the verification mode;
  • the searching unit is further configured to, when the determining unit determines that verification is on, search the source address of the file transmission path for the files corresponding to the index entries of the index verification file;
  • a transmitting unit configured to transmit the files found by the searching unit to the target address.
  • The transmission module further includes: a processing unit configured to, after the transmitting unit transmits the files to the target address, store the index verification file in a preset path of the network end to which the target address belongs when the determining unit determines that file processing is on;
  • and, when the determining unit determines that file processing is off, delete the index verification file or store it in a preset path of the network end to which the source address belongs.
  • the transmission rule further includes a file format
  • The searching unit is further configured to, when the determining unit determines that verification is off, search the source address of the file transmission path for files matching the file format and generate a file transfer list.
  • the configuration file further includes a compressed mode
  • the transmission module includes: a determining unit, a compressing unit, and a transmitting unit; the determining unit is configured to determine a state of the compressed mode;
  • the compression unit is configured to perform compression processing on the found file when the determining unit determines that the compression mode is on;
  • the transmitting unit is configured to transmit the file compressed by the compression unit to the target address, and, when the determining unit determines that the compression mode is off, to transmit the found file to the target address.
  • the configuration file further includes a backup mode and a backup address
  • The transmission module further includes: a determining unit and a processing unit; the determining unit is configured to determine the state of the backup mode after the transmitting unit transfers the found file to the target address;
  • the processing unit is configured to, when the determining unit determines that the backup mode is off, delete the file in the source address of the file transmission path that has been transmitted to the target address;
  • and, when the determining unit determines that the backup mode is on, back up the file in the source address of the file transmission path that has been transmitted to the target address to the backup address of the network end to which the source address belongs.
  • the configuration file further includes a scanning time interval
  • The file transmission apparatus further includes: a timing module configured to perform a timing operation after the transmission module transmits the found file to the target address;
  • the transmission module is further configured to, when the time measured by the timing module reaches the scanning time interval, search the source address of the file transmission path of the current configuration file for files that meet the transmission rule of the current configuration file, and transfer the found files to the target address of that file transmission path.
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the above method.
  • The file transmission method and apparatus provided by the embodiments of the present invention generate, according to a user input, a configuration file including a file transmission path and a transmission rule, where the file transmission path includes a source address and a target address of the file to be transmitted; according to the configuration file, files that meet the transmission rule are searched for in the source address of the file transmission path, and the found files are transferred to the target address.
  • Bulk transfer of massive files with the Hadoop system can thus be performed using only the configuration file, and the transferred files may be formatted data files and/or unformatted data files; the embodiments of the present invention solve the problem that file transfer with the Hadoop system is complicated to operate and inefficient.
  • FIG. 1 is a flowchart of a file transmission method according to an embodiment of the present invention
  • FIG. 2 is a flowchart of another file transmission method according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of still another file transmission method according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a file transmission apparatus according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram of another file transmission apparatus according to an embodiment of the present invention.
  • The application scenarios of data exchange between a terminal device and the HDFS of the Hadoop system generally include data transmission between the memory of the terminal device and the HDFS, and data transmission between a relational database in the network and the HDFS of the Hadoop system.
  • Both kinds of data transmission are performed by the terminal device.
  • The content actually transmitted is data of different types, which exists at the different network ends in the form of files.
  • Between the two ends of each application scenario, files can be uploaded to the HDFS or downloaded to the local terminal device or the relational database.
  • The storage capabilities of the network ends are as follows: the HDFS and the local terminal device can store both structured and unstructured data files, while the relational database can store only structured data files.
  • The terminal device that performs the file transmission method in the following embodiments is usually a local server. Because the terminal device needs to exchange massive data with the Hadoop system, a server running a Linux-based operating system can usually be used. The technical solutions of the present invention are described in detail below through specific embodiments; the following embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.
  • FIG. 1 is a flowchart of a file transmission method according to an embodiment of the present invention.
  • The method may be performed by a data transfer device, which is usually implemented in hardware and software; the device may be integrated in the processor of the terminal device and invoked by the processor.
  • the method in this embodiment may include:
  • S110 Generate a configuration file according to the input of the user, where the configuration file includes a file transmission path and a transmission rule corresponding to the file transmission path, where the file transmission path includes a source address and a target address of the file to be transmitted.
  • The terminal device can start the process by launching a Loader program; after the heartbeat service starts, the configuration interface is entered. A configuration template file for file transfer is provided on the terminal device.
  • The terminal device can present the items to be set in the configuration template file to the user through a graphical user interface (GUI), so that the user can fill in the items required to generate a configuration file.
  • the name of the configuration template file in this embodiment is, for example, template.transferConfig.xml, and the name of the configuration file is, for example, transferConfig.xml.
  • The content of the configuration file generated from user-defined input generally includes a file transmission path and a corresponding transmission rule. The two ends of the file transmission path are the two ends of the application scenario: one end is the HDFS in the Hadoop system, and the other end is the terminal device or the relational database. The file transmission path thus expresses the transmission direction, start point, and end point of the massive files to be transmitted; that is, it includes the source address and the target address of the files to be transmitted.
  • The file transfer in the embodiments of the present invention may be an upload from the memory of the terminal device or the relational database to the HDFS, a download from the HDFS to the memory of the terminal device or the relational database, or a combination of the two transmission directions. That is, the embodiments of the present invention limit neither the number of file transmission paths nor the network ends to which the source address and target address of different file transmission paths belong.
  • The file transmission path defines the source address and the target address of the massive files to be transmitted.
  • The number of files in the source address is usually very large, so the files to be transmitted need to be filtered further; that is, the transmission rule filters out the files that need to be transferred. For example, all images with the suffix ".jpg" in the source address can be transferred to the target address, or all files with the suffix ".doc" in the source address can be transferred to the target address.
  • the foregoing transmission rules in this embodiment are in one-to-one correspondence with the file transmission path, and different file transmission paths may have different transmission rules; and the embodiment does not limit the data format of the files transmitted according to the configuration file. It can be a formatted data file, an unformatted data file, or a combination of the above two data formats.
  • With the method of this embodiment, a mass file transfer with the Hadoop system requires setting a configuration file only once: all files in the paths defined by the configuration file that conform to the transmission rule are transmitted from the source address to the target address. Compared with transferring files by executing command lines, this greatly simplifies the operation and improves the efficiency of file transfer.
  • The file transmission method provided in this embodiment generates, according to a user input, a configuration file including a file transmission path and a transmission rule, where the file transmission path includes a source address and a target address of the file to be transmitted; according to the configuration file, it searches the source address of the file transmission path for files that meet the transmission rule and transfers the found files to the target address.
  • Because massive files can be transferred in bulk with the Hadoop system using only the configuration file set by the user, and the transferred files may be formatted data files and/or unformatted data files, the method provided in this embodiment solves the problem that file transfer with the Hadoop system is complicated to operate and inefficient.
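  • The search-then-transfer step described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: local directories stand in for the source and target addresses (a real target would be HDFS), and the function names `find_matching_files` and `transfer_matching` are hypothetical.

```python
import shutil
from pathlib import Path


def find_matching_files(source_dir, suffix):
    """Return the files directly under source_dir whose names end with suffix."""
    return sorted(
        p for p in Path(source_dir).iterdir()
        if p.is_file() and p.name.endswith(suffix)
    )


def transfer_matching(source_dir, target_dir, suffix):
    """Copy every matching file to target_dir and return the transferred names.

    Assumption: a local copy stands in for the upload to HDFS.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    transferred = []
    for path in find_matching_files(source_dir, suffix):
        shutil.copy2(path, target / path.name)
        transferred.append(path.name)
    return transferred
```

  • One call such as `transfer_matching("in", "out", ".jpg")` moves every matching file, in contrast to issuing one command line per file.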
  • FIG. 2 is a flowchart of another file transmission method according to an embodiment of the present invention.
  • the transmission rule corresponding to the file transmission path is, for example, a file format.
  • S120 in this embodiment may include: S121, searching the source address of the file transmission path for files matching the file format according to the configuration file, and generating a file transfer list; S122, sequentially transferring the corresponding files in the source address to the target address according to the file transfer list.
  • The massive file transfer in this embodiment proceeds as follows: after all files matching the file format have been found, the files in the file transfer list are transferred to the target address in sequence, usually by traversing the file transfer list.
  • Optionally, when the traversal reaches the current file in the file transfer list, it is determined whether the current file is being transferred. If it is not, an HDFS file push thread is opened for the current file and placed in the thread pool to wait for scheduling. It is then determined whether the current file is the last file: if it is, the transfer for this configuration file is complete once the current file has been transferred; if it is not, traversal of the file transfer list continues.
  • If the current file is found to be already in transfer, it is likewise determined whether it is the last file, and subsequent operations follow from that result.
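  • The traversal described above could look roughly like the sketch below. It assumes a caller-supplied `push` callable in place of the HDFS file push thread, and the names `transfer_list`, `in_progress`, and `_run` are hypothetical; the source does not specify how "being transferred" is tracked, so a shared set guarded by a lock is used here.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def _run(push, name, in_progress, lock):
    """Push one file, then clear its in-progress mark."""
    try:
        return push(name)
    finally:
        with lock:
            in_progress.discard(name)


def transfer_list(file_list, push, in_progress=None, max_workers=4):
    """Traverse file_list and queue a push task for each file not already in transfer."""
    in_progress = set() if in_progress is None else in_progress
    lock = threading.Lock()
    futures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name in file_list:
            with lock:
                if name in in_progress:
                    continue  # already being transferred: skip to the next file
                in_progress.add(name)
            futures[name] = pool.submit(_run, push, name, in_progress, lock)
    # Leaving the with-block waits for all queued tasks to finish.
    return {name: future.result() for name, future in futures.items()}
```

  • The loop ends naturally after the last entry, which corresponds to the "is this the last file" check in the text.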
  • the file transmission path in this embodiment may be one or more.
  • When the configuration file includes multiple file transmission paths and the files conforming to the transmission rules in every path must be transmitted simultaneously, the demands on the process and thread configuration of the terminal device are relatively high, and more threads usually need to be configured to meet the requirements of parallel file transfer.
  • Alternatively, the file transmission paths in the configuration file may be traversed in sequence: the files in some of the paths are transmitted first, and then the other file transmission paths in the configuration file are traversed.
  • In this case the search and transmission operations are performed multiple times.
  • For example, the file transmission path includes a first path and a second path, each of which includes at least one file transmission path. S120 in this embodiment may then be replaced by: sequentially searching the source addresses of the first path and the second path for files matching the file format corresponding to the current file transmission path, generating a current file transfer list, and sequentially transferring the corresponding files in the source address of the current file transmission path to the target address according to the current file transfer list.
  • The number of file transmission paths traversed each time may be determined according to the thread configuration of the terminal device. For example, when the terminal device is in single-process, single-thread mode, one file transmission path may be traversed at a time. The file transmission within each path is performed in the same way as in the foregoing embodiment and is not described again here.
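  • Traversing the configured paths a few at a time can be illustrated with a small batching helper. The names `batches` and `process_paths` are hypothetical, and `batch_size` stands in for whatever number the thread configuration of the terminal device allows.

```python
def batches(paths, batch_size):
    """Yield successive groups of at most batch_size file transmission paths."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(paths), batch_size):
        yield paths[start:start + batch_size]


def process_paths(paths, batch_size, process_batch):
    """Run the search-and-transfer step on one group of paths after another."""
    handled = []
    for group in batches(paths, batch_size):
        process_batch(group)  # search and transfer for the current paths
        handled.append(group)
    return handled
```

  • With `batch_size=1` this reduces to the single-process, single-thread case in which one path is handled at a time.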
  • FIG. 3 is a flowchart of still another file transmission method according to an embodiment of the present invention.
  • The configuration file further includes a verification mode, and the transmission rule includes an index verification file.
  • S120 in this embodiment includes: S121, determining the state of the verification mode according to the configuration file; if verification is on, S122 is performed; if verification is off, S123 is performed.
  • the transmission rule in this embodiment may further include a file format
  • S123 may be: searching the source address of the file transmission path for files matching the file format, generating a file transfer list, and sequentially transferring the corresponding files in the source address to the target address according to the file transfer list.
  • The method further includes: S130, determining the state of the verification mode; if file processing is on, S131 is performed; if file processing is off, S132 is performed.
  • The state of the verification mode in this embodiment includes a verification state and a file processing state. The verification state controls whether the verification function is performed, that is, whether the index verification file is used to search for the files to be transmitted; the file processing state controls how the index verification file is handled after the file transfer.
  • The specific operations are as shown in S130 to S132 above.
  • Searching for the files to be transmitted by file format and searching for them by index verification file, as in the embodiment shown in FIG. 3, are two selective ways of searching for files in the present invention.
  • Alternatively, all files in the source address of the file transmission path may be transferred to the target address.
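  • The three ways of choosing files (by index verification file, by file format, or all files) might be sketched as below. The layout of the index verification file is not given in the source, so one file name per line is assumed here, and `select_files` and its parameters are hypothetical names.

```python
from pathlib import Path


def select_files(source_dir, check_enabled, index_file=None, suffix=""):
    """Return the names of the files to transfer from source_dir.

    check_enabled=True: use the entries of the index verification file
    (assumed format: one file name per line) that exist in source_dir.
    check_enabled=False: match by suffix; an empty suffix selects all files.
    """
    source = Path(source_dir)
    if check_enabled:
        lines = Path(index_file).read_text().splitlines()
        entries = [line.strip() for line in lines if line.strip()]
        return [name for name in entries if (source / name).is_file()]
    return sorted(
        p.name for p in source.iterdir()
        if p.is_file() and p.name.endswith(suffix)
    )
```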
  • FIG. 4 is a flowchart of still another file transmission method according to an embodiment of the present invention.
  • S120 in the method provided by this embodiment includes:
  • the configuration file in this embodiment further includes a compression mode.
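  • As a rough illustration of the compression mode, the sketch below compresses a found file before sending it when the mode is on and sends it unchanged when the mode is off. The source does not name a compression format, so gzip is an assumption, as are the function name `send_file` and the use of a local directory in place of the target address.

```python
import gzip
import shutil
from pathlib import Path


def send_file(src, target_dir, compress):
    """Place src in target_dir, gzip-compressed if compress is True.

    Assumptions: gzip as the codec, a local directory as the target.
    Returns the path of the file written to the target.
    """
    src = Path(src)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    if compress:
        dest = target / (src.name + ".gz")
        with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
            shutil.copyfileobj(fin, fout)
    else:
        dest = target / src.name
        shutil.copy2(src, dest)
    return dest
```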
  • The configuration file may further include a backup mode and a backup address. On the basis of the embodiment shown in FIG. 4, after S124 is performed, this embodiment also includes:
  • S130: Determine the state of the backup mode; if the backup mode is off, execute S131; if the backup mode is on, execute S132.
  • If the network end to which the source address belongs is the terminal device, the file is stored in the backup address of the terminal device; if it is the network end where the relational database is located, the file is stored in the backup address at that network end.
  • The network end to which the source address belongs can also be the Hadoop system, in which case the file is stored in the backup address of the HDFS. Note that the backup address may be the same as or different from the source address.
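  • The post-transfer handling of source files can be sketched as follows, assuming local paths for the source and backup addresses. The function name `post_process` is hypothetical; per the text, a transferred source file is deleted when the backup mode is off and kept at the backup address when it is on.

```python
import shutil
from pathlib import Path


def post_process(src, backup_enabled, backup_dir=None):
    """Handle a source file after it has been transferred to the target.

    Backup on: move the file to backup_dir and return the new path.
    Backup off: delete the file and return None.
    """
    src = Path(src)
    if backup_enabled:
        backup = Path(backup_dir)
        backup.mkdir(parents=True, exist_ok=True)
        dest = backup / src.name
        shutil.move(str(src), str(dest))
        return dest
    src.unlink()
    return None
```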
  • When the files to be transmitted form a dynamic data stream, for example when the content of a voice call or video conference is transmitted to the HDFS, the file transfer operation needs to be performed repeatedly, and the current configuration file can be changed through the user's settings.
  • This is implemented as follows: the configuration file further includes a scan time interval.
  • The method further includes: performing a timing operation and, when the timed duration reaches the scan time interval, re-executing S120. At this point, according to the current configuration file, files matching the transmission rule of the current configuration file are searched for in the source address of its file transmission path, and the found files are transferred to the target address of that file transmission path. Note that in each embodiment of the present invention, timing starts after S120, so that S120 may be executed in a loop.
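  • A minimal sketch of the repeated scan is shown below. It reloads the configuration before every round so that user changes are picked up, which is one possible reading of "the current configuration file"; the names `run_periodically`, `load_config`, `run_transfer`, and `max_rounds` are hypothetical.

```python
import time


def run_periodically(scan_interval, load_config, run_transfer,
                     max_rounds=None, sleep=time.sleep):
    """Repeat the search-and-transfer step every scan_interval seconds.

    load_config is called before each round so the current configuration
    is always used. max_rounds bounds the loop (None means run forever).
    Returns the number of rounds executed.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        run_transfer(load_config())
        rounds += 1
        if max_rounds is not None and rounds >= max_rounds:
            break
        sleep(scan_interval)  # timing starts after the transfer step
    return rounds
```

  • Passing a custom `sleep` makes the loop easy to exercise without waiting for real time to pass.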
  • The terminal device that performs the file transmission method can be configured for multiple threads in multi-process mode and can start multiple threads, giving it substantial parallel execution capability during file transmission and further improving the transmission rate. In addition, the Loader program can be started in a dual-machine mode with a host and a standby machine, which improves the reliability of the file transmission method.
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the above method.
  • the name of the configuration file generated by the user's setting is, for example, transferConfig.xml, which can be changed by the user's setting when the terminal device is restarted.
  • the contents of the configuration file transferConfig.xml include, for example:
  • TransferPath indicates the file transfer path and file backup path.
  • the file backup path is specifically the path from the source address to the backup address in the embodiment shown in Figure 4.
  • The transfer path can be an absolute path or a relative path. A relative path is relative to the hadooploader installation path, that is, the installation path of the Loader program on the terminal device. index indicates the path index value.
  • transferRule represents the transmission rule, where name is the rule name; path is the index of the file transfer path, corresponding to the path index value in transferpath; file is the file name; type is the type of file transfer to Hadoop, either Hbase (fill-in value db) or HDFS (fill-in value hdfs); and dest holds the specific content of the file transfer.
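  • To make the relationship between the path index and the rule concrete, the sketch below parses a made-up configuration fragment. The source names the items (index, name, path, file, type) but does not show the XML itself, so the element layout, the `src` attribute, and the sample values are all assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: the real transferConfig.xml schema is not shown in the source.
SAMPLE = """
<config>
  <transferPath index="1" src="/data/in"/>
  <transferRule name="images" path="1" file="*.jpg" type="hdfs"/>
</config>
"""


def load_rules(xml_text):
    """Resolve each transferRule to the transfer path its `path` index refers to."""
    root = ET.fromstring(xml_text)
    paths = {p.get("index"): p.get("src") for p in root.iter("transferPath")}
    rules = []
    for rule in root.iter("transferRule"):
        kind = rule.get("type")
        if kind not in ("db", "hdfs"):  # db = Hbase, hdfs = HDFS
            raise ValueError(f"unknown transfer type: {kind!r}")
        rules.append({
            "name": rule.get("name"),
            "source": paths[rule.get("path")],
            "file": rule.get("file"),
            "type": kind,
        })
    return rules
```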
  • Hbase is a database in Hadoop system, which can support unstructured column mode. For the data of this mode, unstructured data in HDFS can be queried through Hbase.
  • name indicates the name of the table
  • field indicates the name of the field
  • key indicates the index
  • where field indicates the field contained in the index
  • ctm indicates whether the index contains a time: Y indicates that it does, and N indicates that it does not.
  • sqc indicates whether the index contains a serial number: Y indicates that it does, and N indicates that it does not.
  • In the dest configuration item, path indicates the address in the HDFS, and region indicates the directory partition mode. If RegionType in Config.properties is 0, transferred files can be stored by day, week, month, and year, either in the form /YYYYMMWWDD, a single-layer storage path, or in the form /YYYYMMWW/DD, a two-layer storage path. If RegionType in Config.properties is 1, transferred files can be stored by day, month, and year, either in the form /YYYYMMDD or in the form /YYYYMM/DD.
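  • The directory forms listed above can be generated as in the following sketch. The source does not say how WW is numbered, so the ISO week of the year is assumed here; the function name `region_dir` and the `two_layer` flag are hypothetical.

```python
from datetime import date


def region_dir(day, region_type, two_layer=False):
    """Build the partition directory for a date.

    region_type 0: /YYYYMMWWDD or /YYYYMMWW/DD (WW assumed to be the ISO week).
    region_type 1: /YYYYMMDD or /YYYYMM/DD.
    """
    prefix = f"{day:%Y%m}"
    if region_type == 0:
        prefix += f"{day.isocalendar()[1]:02d}"
    elif region_type != 1:
        raise ValueError("region_type must be 0 or 1")
    dd = f"{day:%d}"
    return f"/{prefix}/{dd}" if two_layer else f"/{prefix}{dd}"
```

  • For example, `region_dir(date(2016, 2, 19), 1, two_layer=True)` gives `/201602/19`.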
  • the configuration file supports multiple file transmission paths, and can perform transmission operations on files in multiple file transmission paths in parallel or in turn; and supports dynamic implementation of configuration files.
  • the index verification file is, for example, an OK file.
  • the plug-in verification of the OK file is based on a plug-in verification function extended by the Loader program, and is used to provide a verification interface to complete the user's custom verification of the OK file. If the user does not configure a custom verification plugin, the verification function can be completed using the default verification provided by the terminal device.
  • FIG. 5 is a schematic structural diagram of a file transmission apparatus according to an embodiment of the present invention.
  • the file transmission apparatus is implemented by a combination of hardware and software, and the apparatus may be integrated in a processor of the terminal device so that the processor can call it.
  • the file transmission apparatus of this embodiment includes: a configuration module 11 and a transmission module 12.
  • the configuration module 11 is configured to generate a configuration file according to the input of the user, where the configuration file includes a file transmission path and a transmission rule corresponding to the file transmission path, where the file transmission path includes a source address and a target address of the file to be transmitted.
  • the file transfer in the embodiments of the present invention may be an upload to HDFS from the memory or the relational database of the terminal device, a download from HDFS to the memory or the relational database of the terminal device, or a combination of the two transmission directions. That is, the embodiments of the present invention limit neither the number of file transmission paths nor the network end to which the source address and the target address of each file transmission path belong.
  • the transmission module 12 is configured to search, according to the configuration file generated by the configuration module 11, the source address of the file transmission path for files that meet the transmission rule, and to transfer the found files to the target address.
  • the transferred file includes the formatted data file and/or the unformatted data file.
  • the file transmission device provided by the embodiment of the present invention is used to perform the file transmission method provided by the embodiment shown in FIG. 1 of the present invention, and has a corresponding function module, and the implementation principle and the technical effect thereof are similar, and details are not described herein again.
  • FIG. 6 is a schematic structural diagram of another file transmission apparatus according to an embodiment of the present invention.
  • the transmission rule corresponding to the file transmission path is, for example, a file format.
  • the transmission module 12 in this embodiment includes: a searching unit 13 configured to search the source address of the file transmission path for files matching the file format and to generate a file transfer list; and a transmission unit 14 configured to sequentially transfer the corresponding files in the source address to the target address according to the file transfer list generated by the searching unit 13.
  • there may be one or more file transmission paths in this embodiment. When the configuration file includes multiple file transmission paths, the files to be transmitted may be searched for in all file transmission paths simultaneously, or the search may be carried out in several passes.
  • the file transmission path includes a first path and a second path, each of which includes at least one file transmission path.
  • the searching unit 13 is further configured to search, in turn, the source addresses of the first path and the second path for files matching the file format corresponding to the current file transmission path, and to generate a current file transfer list.
  • the transmission unit 14 is further configured to sequentially transfer the corresponding files in the source address of the current file transmission path to the target address according to the current file transfer list generated by the searching unit 13.
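The division of work between the searching unit and the transmission unit can be sketched as two functions. This is an illustration only: it assumes the file format is expressed as a shell-style wildcard such as `*.csv`, and it uses a local directory copy as a stand-in for the actual upload to HDFS. The function names are hypothetical.

```python
import fnmatch
import shutil
from pathlib import Path


def build_transfer_list(source_dir, file_pattern):
    """Return the sorted names of files in source_dir that match file_pattern."""
    return sorted(
        p.name
        for p in Path(source_dir).iterdir()
        if p.is_file() and fnmatch.fnmatch(p.name, file_pattern)
    )


def transfer_files(source_dir, target_dir, transfer_list):
    """Copy each listed file to target_dir in order (stand-in for an HDFS upload)."""
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    for name in transfer_list:
        shutil.copy2(Path(source_dir) / name, Path(target_dir) / name)
    return transfer_list
```

Building the list first and transferring afterwards means the set of files for one pass is fixed before any data moves, so files that arrive mid-transfer wait for the next pass.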
  • the file transmission device provided by the embodiment of the present invention is used to perform the file transmission method provided by the embodiment shown in FIG. 2 of the present invention, and has a corresponding function module, and the implementation principle and technical effects thereof are similar, and details are not described herein again.
  • the configuration file further includes a verification mode
  • the transmission rule corresponding to the file transmission path may also be an index verification file
  • the transmission module 12 further includes a determining unit 15 configured to determine the state of the verification mode. Correspondingly, the searching unit 13 is further configured to search the source address of the file transmission path for the files corresponding to the index entries of the index verification file when the determining unit 15 determines that the verification mode is check on, and the transmission unit 14 is further configured to transfer the files found by the searching unit 13 to the target address. In addition, if the transmission rule in this embodiment further includes the file format, the searching unit 13 is further configured to search the source address of the file transmission path for files matching the file format and to generate a file transfer list when the determining unit 15 determines that the verification mode is check off, and the transmission unit 14 is further configured to transfer, in turn, the corresponding files in the source address to the target address according to the file transfer list generated by the searching unit 13.
  • the state of the verification mode in this embodiment further includes file processing on or file processing off, which governs further processing after the transmission unit 14 has transferred the files.
  • the determining unit 15 in this embodiment is further configured to determine the state of the verification mode after the transmission unit 14 transfers the files to the target address.
  • the file transmission apparatus further includes a processing unit 16 configured to store the index verification file in a preset path of the network end to which the target address belongs when the determining unit 15 determines that the verification mode is file processing on.
  • the processing unit 16 is further configured to delete the index verification file, or to store it in a preset path of the network end to which the source address belongs, when the determining unit 15 determines that the verification mode is file processing off.
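The branch on the verification mode can be sketched as a single selection function. The layout of the index verification file is not described in this passage, so the sketch assumes one file name per line; that layout, the function name and the parameter names are hypothetical.

```python
import fnmatch
from pathlib import Path


def select_files(source_dir, check_enabled, index_file=None, file_pattern="*"):
    """Choose the files to transfer according to the verification mode."""
    source = Path(source_dir)
    if check_enabled:
        # Assumption: the index verification file holds one file name per line.
        entries = (source / index_file).read_text().splitlines()
        names = [e.strip() for e in entries if e.strip()]
        # Only files that are actually present can be transferred.
        return [n for n in names if (source / n).is_file()]
    # Check off: fall back to matching by file format.
    return sorted(
        p.name
        for p in source.iterdir()
        if p.is_file() and fnmatch.fnmatch(p.name, file_pattern)
    )
```

With the check on, the index file acts as a manifest, so a partially written data file that is not yet listed is never picked up.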
  • searching for the files to be transmitted through the index verification file, and searching for them through the file format as in the foregoing embodiment, are separate ways of finding files provided by the embodiments of the present invention.
  • the file may also be transmitted as follows: the transmission unit 14 is further configured to transfer all the files in the source address of the file transmission path to the target address when the determining unit 15 determines that the verification mode is check off.
  • the file transmission device provided by the embodiment of the present invention is used to execute the file transmission method provided by the embodiment shown in FIG. 3 of the present invention, and has a corresponding function module, and the implementation principle and the technical effect thereof are similar, and details are not described herein again.
  • the configuration file in this embodiment may further include a compression mode, which is illustrated by using the structure of the file transmission apparatus shown in FIG.
  • the transmission module 12 in this embodiment further includes a compression unit 17 configured to compress the found files when the determining unit 15 determines that the compression mode is on.
  • the transmission module 12 is further configured to transfer the files compressed by the compression unit 17 to the target address; in addition, the transmission module 12 is further configured to transfer the found files to the target address without compression when the determining unit 15 determines that the compression mode is off.
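The compression switch can be sketched as follows. The patent does not name a compression format, so gzip is an assumption chosen for illustration, and a local copy again stands in for the transfer to the target address. The function name is hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def transfer_with_optional_compression(src_file, target_dir, compress):
    """Copy src_file to target_dir, gzip-compressing it first when compress is True."""
    src = Path(src_file)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    if compress:
        # Compression mode on: write a .gz copy at the target.
        dest = target / (src.name + ".gz")
        with src.open("rb") as fin, gzip.open(dest, "wb") as fout:
            shutil.copyfileobj(fin, fout)
    else:
        # Compression mode off: transfer the file unchanged.
        dest = target / src.name
        shutil.copy2(src, dest)
    return dest
```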
  • the configuration file in this embodiment may further include a backup mode and a backup address.
  • in that case, the determining unit 15 in the file transmission apparatus shown in FIG. 6 is further configured to determine the state of the backup mode after the transmission module 12 transfers the found files to the target address; accordingly, the processing unit 16 is further configured to delete the files in the source address of the file transmission path that have been transferred to the target address when the determining unit 15 determines that the backup mode is off.
  • the processing unit 16 is further configured, when the determining unit 15 determines that the backup mode is on, to back up the files in the source address of the file transmission path that have been transferred to the target address to the backup address of the network end to which the source address belongs.
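The handling of the source copy after a successful transfer can be sketched as follows. One assumption is made explicit here: backing up is implemented as a move out of the source directory, so that the same file is not found again on the next scan. The text only says the file is backed up to the backup address, and the function name is hypothetical.

```python
import shutil
from pathlib import Path


def finish_source_file(src_file, backup_enabled, backup_dir=None):
    """After a successful transfer, back up or delete the source copy."""
    src = Path(src_file)
    if backup_enabled:
        backup = Path(backup_dir)
        backup.mkdir(parents=True, exist_ok=True)
        dest = backup / src.name
        # Assumption: the backup is a move, not a copy.
        shutil.move(str(src), str(dest))
        return dest
    # Backup mode off: remove the transferred file from the source address.
    src.unlink()
    return None
```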
  • the file transmission device provided by the embodiment of the present invention is used to execute the file transmission method provided by the embodiment shown in FIG. 4 of the present invention, and has a corresponding function module, and the implementation principle and the technical effect thereof are similar, and details are not described herein again.
  • when the file to be transmitted is a dynamic data stream, for example when the content of a voice call or a video conference is transmitted to HDFS, the file transfer operation needs to be performed repeatedly, and the current configuration file can be changed through settings made by the user.
  • the configuration file in this embodiment further includes a scanning time interval. Based on the structure of the file transmission device shown in FIG.
  • the apparatus provided in this embodiment further includes a timing module 18 configured to perform a timing operation after the transmission module 12 transfers the found files to the target address.
  • correspondingly, the transmission module 12 is further configured, when the time counted by the timing module 18 reaches the scanning time interval, to search, according to the current configuration file, the source address of the file transmission path of the current configuration file for files conforming to the transmission rule of the current configuration file, and to transfer the found files to the target address of that file transmission path.
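The interaction between the timing module and the transmission module amounts to a scan loop that re-reads the configuration before each pass. The sketch below assumes the configuration is a dictionary with a `scan_interval` key in seconds; that key, the callback names and the `max_rounds` limit (added so the loop can terminate in a test) are hypothetical.

```python
import time


def run_periodically(load_config, transfer_once, max_rounds=None, sleep=time.sleep):
    """Repeat the transfer, re-reading the configuration before every round."""
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        config = load_config()  # picks up any user change to the configuration
        transfer_once(config)
        rounds += 1
        if max_rounds is not None and rounds >= max_rounds:
            break
        sleep(config["scan_interval"])  # hypothetical key, in seconds
    return rounds
```

Because the configuration is loaded inside the loop, a change made by the user between two scans applies to the next scan without restarting the program.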
  • each module/unit in the above embodiments may be implemented in hardware, for example by an integrated circuit that realizes the corresponding function, or as a software function module, for example by a processor executing programs/instructions stored in memory to realize the corresponding function.
  • the invention is not limited to any specific form of combination of hardware and software.
  • the above technical solution enables batch transfer of massive numbers of files with the Hadoop system using only the configuration file set by the user, and the transferred files may be formatted data files and/or unformatted data files; it solves the problem that, when files are transferred with the Hadoop system, the file transfer operation is complicated and inefficient.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A data transfer method and apparatus. The data transfer method comprises: generating a configuration file according to input of a user, wherein the configuration file comprises a file transfer path and a transfer rule corresponding to the file transfer path, and the file transfer path comprises a source address and a destination address of a file to be transferred; and searching, according to the configuration file, for a file satisfying the transfer rule in the source address of the file transfer path, and transferring the found file to the destination address, wherein the transferred file comprises a formatted data file and/or a non-formatted data file. The technical solution solves the problems of a complex file transfer operation mode and low efficiency in the mode of file transfer with a Hadoop system.

Description

File transmission method and apparatus
Technical field
This document relates to, but is not limited to, the field of data storage technology, and in particular to a file transmission method and apparatus.
Background
With the development of the Internet, the Internet of Things built on it has come into widespread use for information exchange and data transmission. To meet users' need to store massive amounts of data, the distributed system infrastructure developed by the Apache Foundation, namely the Hadoop system, is commonly used.
The Hadoop system includes a distributed file system (Hadoop Distributed File System, HDFS for short) and a distributed database (Hbase). Data in the Hadoop system is stored as structured or unstructured data files kept in HDFS in file form, and data in unstructured column mode can be queried through Hbase. With the Hadoop system as a platform for massive data storage, a user can upload or download files from a terminal device by executing command lines. For example, a PUT command uploads a file stored on the terminal device to the HDFS of the Hadoop system, and a GET command downloads a file stored in the HDFS of the Hadoop system to the memory of the terminal device or to a relational database at the peer end of the Hadoop system. In this approach, each command line executed supports the transfer of only a single file, and the data format of the file in each transfer is fixed; transferring multiple files, or files in different data formats, requires manually entering command lines multiple times.
However, in the related art's way of transferring files with the Hadoop system, the carrier of the transfer, that is, each command line, supports only a single file and a fixed data format per transfer, which makes the file transfer operation complicated and inefficient.
Summary of the invention
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
Embodiments of the present invention provide a file transmission method and apparatus that avoid the problem of complicated and inefficient file transfer operations when files are transferred with the Hadoop system.
In a first aspect, an embodiment of the present invention provides a file transmission method, including:
generating a configuration file according to user input, where the configuration file includes a file transmission path and a transmission rule corresponding to the file transmission path, and the file transmission path includes a source address and a target address of the file to be transmitted; and
searching, according to the configuration file, the source address of the file transmission path for files that meet the transmission rule, and transferring the found files to the target address, where the transferred files include formatted data files and/or unformatted data files.
In a first possible implementation of the first aspect, the transmission rule includes a file format, and searching the source address of the file transmission path for files that meet the transmission rule and transferring the found files to the target address includes:
searching the source address of the file transmission path for files matching the file format, and generating a file transfer list; and
sequentially transferring the corresponding files in the source address to the target address according to the file transfer list.
According to the first possible implementation of the first aspect, in a second possible implementation, the file transmission path includes a first path and a second path, each of which includes at least one file transmission path, and searching the source address of the file transmission path for files matching the file format, generating a file transfer list, and sequentially transferring the corresponding files in the source address to the target address according to the file transfer list includes:
searching, in turn, the source addresses of the first path and the second path for files matching the file format corresponding to the current file transmission path, generating a current file transfer list, and sequentially transferring the corresponding files in the source address of the current file transmission path to the target address according to the current file transfer list.
In a third possible implementation of the first aspect, the configuration file further includes a verification mode, the transmission rule includes an index verification file, and searching the source address of the file transmission path for files that meet the transmission rule and transferring the found files to the target address includes:
when the verification mode is check on, searching the source address of the file transmission path for the files corresponding to the index entries of the index verification file, and transferring the found files to the target address;
after the files are transferred to the target address, the method further includes:
when the verification mode is file processing on, storing the index verification file in a preset path of the network end to which the target address belongs; and
when the verification mode is file processing off, deleting the index verification file or storing the index verification file in a preset path of the network end to which the source address belongs.
According to the third possible implementation of the first aspect, in a fourth possible implementation, the transmission rule further includes a file format, and searching the source address of the file transmission path for files that meet the transmission rule and transferring the found files to the target address further includes:
when the verification mode is check off, searching the source address of the file transmission path for files matching the file format, generating a file transfer list, and sequentially transferring the corresponding files in the source address to the target address according to the file transfer list.
According to the first aspect or any one of the first to fourth possible implementations of the first aspect, in a fifth possible implementation, the configuration file further includes a compression mode, and transferring the found files to the target address includes:
when the compression mode is on, compressing the found files and transferring the compressed files to the target address; and
when the compression mode is off, transferring the found files to the target address.
According to the first aspect or any one of the first to fourth possible implementations of the first aspect, in a sixth possible implementation, the configuration file further includes a backup mode and a backup address, and after the found files are transferred to the target address, the method further includes:
when the backup mode is off, deleting the files in the source address of the file transmission path that have been transferred to the target address; and
when the backup mode is on, backing up the files in the source address of the file transmission path that have been transferred to the target address to the backup address of the network end to which the source address belongs.
According to the first aspect or any one of the first to fourth possible implementations of the first aspect, in a seventh possible implementation, the configuration file further includes a scanning time interval, and after searching the source address of the file transmission path for files that meet the transmission rule and transferring the found files to the target address, the method further includes:
when the timed period reaches the scanning time interval, searching, according to the current configuration file, the source address of the file transmission path of the current configuration file for files that meet the transmission rule of the current configuration file, and transferring the found files to the target address of the file transmission path of the current configuration file.
In a second aspect, an embodiment of the present invention provides a file transmission apparatus, including:
a configuration module configured to generate a configuration file according to user input, where the configuration file includes a file transmission path and a transmission rule corresponding to the file transmission path, and the file transmission path includes a source address and a target address of the file to be transmitted; and
a transmission module configured to search, according to the configuration file generated by the configuration module, the source address of the file transmission path for files that meet the transmission rule, and to transfer the found files to the target address, where the transferred files include formatted data files and/or unformatted data files.
In a first possible implementation of the second aspect, the transmission rule includes a file format, and the transmission module includes: a searching unit configured to search the source address of the file transmission path for files matching the file format and to generate a file transfer list; and
a transmission unit configured to sequentially transfer the corresponding files in the source address to the target address according to the file transfer list generated by the searching unit.
According to the first possible implementation of the second aspect, in a second possible implementation, the file transmission path includes a first path and a second path, each of which includes at least one file transmission path; the searching unit is configured to search, in turn, the source addresses of the first path and the second path for files matching the file format corresponding to the current file transmission path, and to generate a current file transfer list; and
the transmission unit is configured to sequentially transfer the corresponding files in the source address of the current file transmission path to the target address according to the current file transfer list generated by the searching unit.
In a third possible implementation of the second aspect, the configuration file further includes a verification mode, the transmission rule includes an index verification file, and the transmission module includes: a determining unit configured to determine the state of the verification mode;
a searching unit further configured to search the source address of the file transmission path for the files corresponding to the index entries of the index verification file when the determining unit determines that the verification mode is check on; and
a transmission unit further configured to transfer the files found by the searching unit to the target address;
the file transmission module further includes a processing unit configured to, after the files are transferred to the target address, store the index verification file in a preset path of the network end to which the target address belongs when the determining unit determines that the verification mode is file processing on; and
to delete the index verification file, or store the index verification file in a preset path of the network end to which the source address belongs, when the determining unit determines that the verification mode is file processing off.
According to the third possible implementation of the second aspect, in a fourth possible implementation, the transmission rule further includes a file format, and the searching unit is further configured to search the source address of the file transmission path for files matching the file format and to generate a file transfer list when the determining unit determines that the verification mode is check off.
According to the second aspect or any one of the first to fourth possible implementations of the second aspect, in a fifth possible implementation, the configuration file further includes a compression mode, and the transmission module includes a determining unit, a compression unit and a transmission unit; the determining unit is configured to determine the state of the compression mode;
the compression unit is configured to compress the found files when the determining unit determines that the compression mode is on; and
the transmission module is configured to transfer the files compressed by the compression unit to the target address, and to transfer the found files to the target address when the determining unit determines that the compression mode is off.
According to the second aspect or any one of the first to fourth possible implementations of the second aspect, in a sixth possible implementation, the configuration file further includes a backup mode and a backup address, and the transmission module further includes a determining unit and a processing unit; the determining unit is configured to determine the state of the backup mode after the transmission unit transfers the found files to the target address;
the processing unit is configured to delete the files in the source address of the file transmission path that have been transferred to the target address when the determining unit determines that the backup mode is off; and
to back up the files in the source address of the file transmission path that have been transferred to the target address to the backup address of the network end to which the source address belongs when the determining unit determines that the backup mode is on.
According to the second aspect or any one of the first to fourth possible implementations of the second aspect, in a seventh possible implementation, the configuration file further includes a scanning time interval, and the file transmission apparatus further includes a timing module configured to perform a timing operation after the transmission module transfers the found files to the target address;
the transmission module is further configured, when the time counted by the timing module reaches the scanning time interval, to search, according to the current configuration file, the source address of the file transmission path of the current configuration file for files that meet the transmission rule of the current configuration file, and to transfer the found files to the target address of the file transmission path of the current configuration file.
本发明实施例还提供了一种计算机存储介质,所述计算机存储介质中存储有计算机可执行指令,所述计算机可执行指令用于执行上述的方法。The embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the above method.
In the file transfer method and apparatus provided by the embodiments of the present invention, a configuration file including a file transfer path and a transfer rule is generated according to user input, the file transfer path including a source address and a target address of the files to be transferred. According to the configuration file, the source address of the file transfer path is searched for files that satisfy the transfer rule, and the found files are transferred to the target address. By using a configuration file, the embodiments make it possible to transfer massive numbers of files to and from a Hadoop system in batches using only the configuration file set by the user, and the transferred files may be formatted data files and/or unformatted data files. The embodiments of the present invention thereby solve the problem that existing ways of transferring files to and from a Hadoop system are complicated to operate and inefficient.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.
Brief Description of the Drawings
FIG. 1 is a flowchart of a file transfer method provided by an embodiment of the present invention;
FIG. 2 is a flowchart of another file transfer method provided by an embodiment of the present invention;
FIG. 3 is a flowchart of yet another file transfer method provided by an embodiment of the present invention;
FIG. 4 is a flowchart of still another file transfer method provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a file transfer apparatus provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another file transfer apparatus provided by an embodiment of the present invention.
Embodiments of the Invention
Embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that, provided there is no conflict, the embodiments in this application and the features in the embodiments may be combined with each other arbitrarily.
The steps shown in the flowcharts of the drawings may be performed in a computer system, for example as a set of computer-executable instructions. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.
In the related art, application scenarios in which a terminal device exchanges data with the HDFS of a Hadoop system generally include data transfer between the storage of the terminal device and HDFS, and data transfer between HDFS and a relational database located at the peer end of the Hadoop system in the network. Both kinds of data transfer are performed by the terminal device. In the file transfer described in the following embodiments of the present invention, what is actually transferred is data of different types. The data exists as files on different network ends and can be transferred in either direction between the two ends of the above application scenarios: it can be uploaded to HDFS, or downloaded from HDFS to the local terminal device or to the relational database. Optionally, the storage capabilities of these network ends are as follows: HDFS and the local terminal device can both store structured and unstructured data files, whereas the relational database can store only structured data files.
It should be noted that the terminal device that performs the file transfer method in the following embodiments is usually a local server. Because the terminal device in the embodiments of the present invention needs to transfer massive amounts of data to and from the Hadoop system, a server running a Linux-based operating system can usually be used. The technical solutions of the present invention are described in detail below through specific embodiments. The following specific embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.
FIG. 1 is a flowchart of a file transfer method provided by an embodiment of the present invention. The file transfer method provided by this embodiment is applicable to transferring files to and from the HDFS of a Hadoop system. The method may be performed by a data transfer apparatus, which is usually implemented in hardware and software and may be integrated in the processor of the terminal device for the processor to invoke. As shown in FIG. 1, the method of this embodiment may include:
S110: Generate a configuration file according to user input. The configuration file includes a file transfer path and a transfer rule corresponding to that file transfer path, where the file transfer path includes a source address and a target address of the files to be transferred.
In this embodiment, the startup procedure of the terminal device can be initiated by launching the Loader program on the terminal device, and the settings interface is entered after the heartbeat service has started. The terminal device provides a configuration template file for file transfer, which contains the configuration items that need to be set. The terminal device can present the items of the configuration template file that need to be set to the user through a graphical user interface (GUI), so that the user can define the requirements for the file transfer in the configuration template file and thereby generate the configuration file. In this embodiment, the configuration template file is named, for example, template.transferConfig.xml, and the configuration file is named, for example, transferConfig.xml.
In a specific implementation, the content of the configuration file generated from user-defined input usually includes a file transfer path and a corresponding transfer rule. The file transfer path connects the two ends of the above application scenarios: one end is the HDFS in the Hadoop system, and the other end is the terminal device or the relational database. In other words, the file transfer path expresses the transfer direction, the starting point and the end point of the massive number of files to be transferred; that is, the file transfer path includes the source address and the target address of the files to be transferred.
It should be noted that the file transfer in the embodiments of the present invention may upload files to HDFS from the storage of the terminal device or from the relational database, may download files from HDFS to the storage of the terminal device or to the relational database, or may combine both transfer directions. That is, the embodiments of the present invention do not limit the number of file transfer paths, nor do they limit the network ends to which the source addresses and target addresses of different file transfer paths belong.
S120: According to the configuration file, search the source address of the file transfer path for files that satisfy the transfer rule, and transfer the found files to the target address. The transferred files include formatted data files and/or unformatted data files.
In this embodiment, the file transfer path defines the source address and the target address of the massive number of files to be transferred. The number of files in the source address given in the configuration file is usually very large, so the files that need to be transferred must be filtered further; that is, the transfer rule is used to filter out the files to be transferred. For example, all pictures in the source address with the suffix ".jpg" may be transferred to the target address, or all files in the source address with the suffix ".doc" may be transferred to the target address.
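The suffix-based filtering described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: it assumes the source address is a local directory, and the function name `find_matching_files` and the case-insensitive matching are assumptions made for the example.

```python
from pathlib import Path
import tempfile


def find_matching_files(source_dir, suffixes):
    """Return the files directly under source_dir whose suffix is in suffixes."""
    # Assumption: suffix matching is case-insensitive (".JPG" matches ".jpg").
    wanted = {s.lower() for s in suffixes}
    return sorted(
        p for p in Path(source_dir).iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )


# Usage: only the pictures are selected for transfer.
with tempfile.TemporaryDirectory() as src:
    for name in ("a.jpg", "b.doc", "c.JPG", "notes.txt"):
        (Path(src) / name).touch()
    matches = [p.name for p in find_matching_files(src, [".jpg"])]
    print(matches)
```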
It should be noted that the transfer rules in this embodiment correspond one-to-one to the file transfer paths, and different file transfer paths may have different transfer rules. This embodiment also does not limit the data format of the files transferred according to the configuration file: they may be formatted data files, unformatted data files, or a combination of the two. In addition, when transferring massive numbers of files to and from the Hadoop system in this embodiment, the configuration file needs to be set only once in order to transfer all files that satisfy the transfer rule under the paths defined in the configuration file from the source address to the target address. Compared with transferring files by executing command lines, this greatly simplifies operation and improves the efficiency of file transfer.
In the file transfer method provided by this embodiment, a configuration file including a file transfer path and a transfer rule is generated according to user input, the file transfer path including a source address and a target address of the files to be transferred. According to the configuration file, the source address of the file transfer path is searched for files that satisfy the transfer rule, and the found files are transferred to the target address. By using a configuration file, this embodiment makes it possible to transfer massive numbers of files to and from a Hadoop system in batches using only the configuration file set by the user, and the transferred files may be formatted data files and/or unformatted data files. The method provided by this embodiment thereby solves the problem that existing ways of transferring files to and from a Hadoop system are complicated to operate and inefficient.
Optionally, FIG. 2 is a flowchart of another file transfer method provided by an embodiment of the present invention. In this embodiment the transfer rule corresponding to the file transfer path is, for example, a file format. On the basis of the embodiment shown in FIG. 1, S120 in this embodiment may include: S121: according to the configuration file, search the source address of the file transfer path for files that match the file format, and generate a file transfer list; S122: according to the file transfer list, transfer the corresponding files in the source address to the target address one after another.
The specific way in which this embodiment performs massive file transfer is as follows. After all files matching the file format have been found, the files in the file transfer list need to be transferred to the target address one after another, usually by traversing the file transfer list. Optionally, when the traversal reaches the current file in the file transfer list, it is determined whether the current file is already being transferred. If the current file is not being transferred, an HDFS file push thread is created for it and placed in a thread pool to await scheduling. It is then determined whether the current file is the last file: if so, the transfer task of the configuration file is complete once the current file has been transferred; if not, the traversal continues with the remaining files in the file transfer list. If the current file is found to be already in transfer, it can likewise be determined whether it is the last file, and the subsequent operations are performed according to the result.
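The traversal described above can be sketched with a thread pool as follows. This is an illustrative sketch under stated assumptions, not the patent's code: `push` stands in for whatever routine actually writes a file to HDFS, and the names `transfer_all`, `_run` and `in_progress` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def _run(push, name, in_progress, lock):
    """Push one file, then clear its in-progress marker."""
    try:
        push(name)
        return name
    finally:
        with lock:
            in_progress.discard(name)


def transfer_all(transfer_list, push, in_progress=None, max_workers=4):
    """Traverse the list; queue a push task for each file not already in transfer."""
    in_progress = set() if in_progress is None else in_progress
    lock = threading.Lock()
    futures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name in transfer_list:
            with lock:
                if name in in_progress:
                    continue  # already being transferred: move on to the next file
                in_progress.add(name)
            futures.append(pool.submit(_run, push, name, in_progress, lock))
    # Leaving the "with" block waits for every queued task to finish.
    return [f.result() for f in futures]


# Usage: "b.jpg" is marked as already in transfer, so it is skipped.
sent = []
done = transfer_all(["a.jpg", "b.jpg", "c.jpg"], sent.append, in_progress={"b.jpg"})
print(done)
```

The "is this the last file" check from the text is implicit here: the loop simply ends after the last list entry, and the pool is drained before the function returns.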
In a specific implementation, there may be one or more file transfer paths in this embodiment. On the one hand, if the configuration file includes only one file transfer path, only the files under that path that satisfy the transfer rule are transferred, and the requirements on the process and thread configuration of the terminal device are modest. On the other hand, if the configuration file includes multiple file transfer paths and the files satisfying the transfer rule under every path must be transferred at the same time, the requirements on the process and thread configuration of the terminal device are higher; a multi-process, multi-thread configuration is usually required in order to transfer files in parallel. As a further alternative, if the configuration file includes multiple file transfer paths, some of the file transfer paths in the configuration file may be traversed first, so that the files under those paths are transferred first, and the remaining file transfer paths are traversed afterwards, the search and transfer operations being performed several times. Specifically, the file transfer paths include, for example, a first path and a second path, each of which includes at least one file transfer path. S120 in the embodiment shown in FIG. 1 may then be replaced by: searching, in turn, the source addresses of the first path and of the second path for files that match the file format corresponding to the current file transfer path, generating a current file transfer list, and, according to the current file transfer list, transferring the corresponding files in the source address of the current file transfer path to the target address one after another. It should be noted that the number of file transfer paths traversed each time in this embodiment may be determined by the thread configuration of the terminal device. For example, when the terminal device runs in single-process, single-thread mode, one file transfer path may be traversed at a time to transfer the files under that path. The transfer itself is performed in the same way as in the foregoing embodiment and is not described again here.
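The staged traversal of path groups can be sketched as follows: groups are processed one after another, and the paths inside a group are handled in parallel. This is only a sketch; `run_path_groups` and `transfer_path` are hypothetical names, and using one worker thread per path in a group is an assumption made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_path_groups(path_groups, transfer_path):
    """Process groups sequentially; paths within one group run in parallel."""
    finished = []
    for group in path_groups:
        if not group:
            continue
        # Assumption: the device can afford one worker thread per path in a group.
        with ThreadPoolExecutor(max_workers=len(group)) as pool:
            finished.extend(pool.map(transfer_path, group))
    return finished


# Usage: the "first path" group holds two transfer paths, the "second path" group one.
# In single-process, single-thread mode each group would hold exactly one path.
results = run_path_groups([["local->hdfs", "db->hdfs"], ["hdfs->local"]], str.upper)
print(results)
```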
FIG. 3 is a flowchart of yet another file transfer method provided by an embodiment of the present invention. In another possible implementation of the embodiment shown in FIG. 1, the configuration file further includes a verification mode, and the transfer rule corresponding to the file transfer path is, for example, an index verification file. On the basis of the embodiment shown in FIG. 1, S120 in this embodiment includes: S121: determine the state of the verification mode according to the configuration file; if verification is enabled, perform S122; if verification is disabled, perform S123.
S122: Search the source address of the file transfer path for the files corresponding to the index entries of the index verification file, and transfer the found files to the target address.
It should be noted that the transfer rule in this embodiment may also include a file format, in which case S123 may be: search the source address of the file transfer path for files that match the file format, generate a file transfer list, and, according to the file transfer list, transfer the corresponding files in the source address to the target address one after another.
In this embodiment, after S122 or S123 is performed, the method further includes: S130: determine the state of the verification mode; if file processing is enabled, perform S131; if file processing is disabled, perform S132.
S131: Store the index verification file in a preset path on the network end to which the target address belongs.
S132: Delete the index verification file, or store it in a preset path on the network end of the source address.
It should be noted that the state of the verification mode in this embodiment comprises a verification state and a file processing state. The verification state controls whether the verification function is performed, that is, whether the index verification file is used to find the files to be transferred. The file processing state controls how the index verification file is handled after the file transfer has been performed, as described in S130 to S132 above.
It should also be noted that finding the files to be transferred by file format, as in the embodiment shown in FIG. 2, and finding them by index verification file, as in the embodiment shown in FIG. 3, are two alternative ways of finding files in the present invention, and usually one of them is chosen. In another possible implementation of this embodiment, if the transfer rule contains no file format and S121 determines that verification is disabled, the files may also be transferred as follows: all files in the source address of the file transfer path are transferred to the target address.
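The three selection branches (index verification file, file format, or everything) can be summarised in a short sketch. The function name `select_files` and its parameters are hypothetical; file names are treated as plain strings, and the index verification file is assumed to have been parsed into a list of entries already.

```python
def select_files(source_files, check_enabled, index_entries=None, file_format=None):
    """Pick the files to transfer according to the verification mode and rule."""
    if check_enabled:
        # Verification on: only files named by the index verification file.
        wanted = set(index_entries or [])
        return [f for f in source_files if f in wanted]
    if file_format:
        # Verification off, format given: match on the file suffix.
        return [f for f in source_files if f.endswith(file_format)]
    # Verification off and no format in the rule: transfer everything.
    return list(source_files)


# Usage
files = ["a.jpg", "b.doc", "c.jpg"]
print(select_files(files, True, index_entries=["c.jpg", "missing.jpg"]))
print(select_files(files, False, file_format=".jpg"))
print(select_files(files, False))
```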
Optionally, FIG. 4 is a flowchart of still another file transfer method provided by an embodiment of the present invention. On the basis of the foregoing embodiments of the present invention, S120 in the method provided by this embodiment includes:
S121: According to the configuration file, search the source address of the file transfer path for files that satisfy the transfer rule. The configuration file in this embodiment further includes a compression mode.
S122: Determine the state of the compression mode; if the compression mode is on, perform S123; if the compression mode is off, perform S124.
S123: Compress the found files, for example using the LZO (Lempel-Ziv-Oberhumer) algorithm. S124 is then performed as well, that is, the files are transferred to the target address; in this case the files transferred to the target address are the compressed files. It should be noted that this embodiment is described, by way of example, on the basis of the embodiment shown in FIG. 1.
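The compress-then-transfer branch can be sketched as follows. Note the substitution: the text names LZO, which is not part of the Python standard library, so this sketch uses gzip purely as a stand-in to show the control flow. The function name `prepare_for_transfer` and the ".gz" naming are assumptions.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def prepare_for_transfer(path, compress):
    """Return the file to send: the original, or a compressed copy beside it."""
    path = Path(path)
    if not compress:
        return path  # compression mode off: go straight to the transfer step
    out = path.with_name(path.name + ".gz")  # gzip stands in for LZO here
    with open(path, "rb") as src, gzip.open(out, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out


# Usage: compress a small file and check that it round-trips.
with tempfile.TemporaryDirectory() as d:
    original = Path(d) / "report.doc"
    original.write_bytes(b"hello hadoop" * 100)
    packed = prepare_for_transfer(original, compress=True)
    with gzip.open(packed, "rb") as fh:
        restored = fh.read()
    print(packed.name, restored == original.read_bytes())
```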
Optionally, on the basis of the foregoing embodiments of the present invention, the configuration file may further include a backup mode and a backup address. Taking the embodiment shown in FIG. 4 as an example, after S124 is performed, this embodiment further includes:
S130: Determine the state of the backup mode; if the backup mode is off, perform S131; if the backup mode is on, perform S132.
S131: Delete from the source address of the file transfer path the files that have been transferred to the target address.
S132: Back up the files in the source address of the file transfer path that have been transferred to the target address to the backup address on the network end to which the source address belongs.
In this embodiment, if the network end to which the source address belongs is the local terminal device, the files are stored in the backup address of the terminal device; if it is the network end where the relational database is located, the files are stored in the backup address of the relational database; and if it is the Hadoop system, the files are stored in the backup address of HDFS. It should be noted that the backup address may be the same as the source address or different from it.
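The post-transfer handling of source files can be sketched for the case where the source end is a local directory. The sketch assumes that "backing up" means moving the file into the backup directory, which the text does not state explicitly; `finish_source_file` is a hypothetical name.

```python
import shutil
import tempfile
from pathlib import Path


def finish_source_file(path, backup_enabled, backup_dir=None):
    """After a successful transfer: delete the source file, or move it to backup."""
    path = Path(path)
    if not backup_enabled:
        path.unlink()  # backup mode off: remove the transferred file
        return None
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / path.name
    if target.resolve() == path.resolve():
        return target  # backup address equals source address: leave the file in place
    # Assumption: backup is implemented as a move within the same network end.
    return Path(shutil.move(str(path), str(target)))


# Usage
with tempfile.TemporaryDirectory() as d:
    f = Path(d) / "a.jpg"
    f.touch()
    kept = finish_source_file(f, backup_enabled=True, backup_dir=Path(d) / "bak")
    print(kept.name, kept.exists(), f.exists())
```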
Optionally, on the basis of the foregoing embodiments of the present invention, if the files to be transferred form a dynamic data stream, for example when the content of a voice call or video conference is transferred to HDFS, the file transfer operation needs to be performed repeatedly, and the current configuration file can be changed through user settings. Specifically, the configuration file further includes a scan time interval, and the method provided by this embodiment further includes, after S120: performing a timing operation and, when the counted time reaches the scan time interval, performing S120 again. At that point, the source address of the file transfer path of the current configuration file is searched for files that satisfy the transfer rule of the current configuration file, and the found files are transferred to the target address of the file transfer path of the current configuration file. It should be noted that, in the methods provided by the embodiments of the present invention, timing starts after S120, and S120 can be performed in a loop.
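The timed re-scan can be sketched as a simple loop that reloads the configuration before each round, so that user changes take effect on the next scan. All names here (`run_scan_loop`, `load_config`, `run_transfer`, the `"scan_interval"` key) are hypothetical, and the `max_rounds` bound exists only to keep the example finite.

```python
import time


def run_scan_loop(load_config, run_transfer, max_rounds, sleep=time.sleep):
    """Repeat the search-and-transfer step, waiting scan_interval between rounds."""
    for round_no in range(max_rounds):
        config = load_config()  # re-read so the *current* configuration is used
        run_transfer(config)    # corresponds to performing S120 again
        if round_no + 1 < max_rounds:
            sleep(config["scan_interval"])  # timing starts after the transfer


# Usage with a fake sleep so the example finishes immediately.
waits = []
rounds = []
run_scan_loop(lambda: {"scan_interval": 30}, rounds.append, max_rounds=3, sleep=waits.append)
print(len(rounds), waits)
```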
It should be noted that the terminal device that performs the file transfer method provided by the embodiments of the present invention may be configured for multi-threading in multi-process mode. It can then start multiple threads and has considerable parallel execution capability when transferring files, which further increases the transfer rate. In addition, the Loader program that is started can run in a dual-machine mode with an active host and a standby host, which improves the reliability of the file transfer method.
An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the above method.
The specific content of the configuration file in the foregoing embodiments of the present invention is described below. The configuration file generated from the user's settings is named, for example, transferConfig.xml, and can be changed through user settings when the terminal device is restarted. The content of the configuration file transferConfig.xml includes, for example:
1. transferPath: indicates the file transfer paths and the file backup paths. A file backup path is the path from the source address to the backup address in the embodiment shown in FIG. 4. transferPath may contain an absolute path or a relative path; a relative path is relative to the installation path of hadooploader, that is, the installation path of the Loader program on the terminal device. index indicates the path index value.
2. transferRule: indicates the transfer rule, where name is the rule name; path is the index of the file transfer path and corresponds to a path index value in transferPath; file is the file name; and type is the kind of Hadoop destination to which the file is transferred, namely HBase or HDFS, with the value db for HBase and the value hdfs for HDFS. dest specifies the details of the file transfer. It should be noted that HBase is the database in the Hadoop system; it supports an unstructured column model, and for data in that model the unstructured data in HDFS can be queried through HBase.
3. For the db type: name is the table name, field is a field name, and key is the index. Within key, field gives the fields included in the index; ctm indicates whether the index includes the time (Y for yes, N for no); and sqc indicates whether the index includes a sequence number (Y for yes, N for no).
4. For the hdfs type: path in the dest configuration item is the address in HDFS, and region is the directory partitioning scheme. If RegionType in Config.properties is 0, transferred files can be stored by day, week, month and year in the form /YYYYMMWWDD, which is a single-level storage path, or in the days form /YYYYMMWW/DD, which is a two-level storage path. If RegionType in Config.properties is 1, transferred files can be stored by day, month and year in the form /YYYYMMDD, or in the days form /YYYYMM/DD.
5. The configuration file supports multiple file transfer paths; the files under multiple file transfer paths can be transferred in parallel or one path after another. Changes to the configuration file also take effect dynamically.
6. The index verification file is, for example, an OK file. Plug-in verification of the OK file is a plug-in verification function added as an extension to the Loader program; it provides a verification interface through which the user can apply custom verification to the OK file. If the user has not configured a custom verification plug-in, the default verification provided by the terminal device can be used to perform the verification function.
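The directory partitioning forms listed in item 4 can be illustrated with a small sketch that builds the partition directory for a given date. Several points are assumptions made for the example: the function name `partition_dir`, the `two_level` flag standing in for the days form, and the reading of WW as the zero-padded ISO week number (the text does not say how the week is numbered).

```python
from datetime import date


def partition_dir(day, region_type, two_level=False):
    """Build the HDFS partition directory for a date under the given RegionType."""
    ym = f"{day.year:04d}{day.month:02d}"
    dd = f"{day.day:02d}"
    if region_type == 0:
        # Assumption: WW is the ISO week number. Around New Year the ISO week
        # can belong to the neighbouring year; a real scheme may count differently.
        prefix = ym + f"{day.isocalendar()[1]:02d}"
    elif region_type == 1:
        prefix = ym
    else:
        raise ValueError("RegionType must be 0 or 1")
    # Single-level form: /YYYYMM[WW]DD; two-level (days) form: /YYYYMM[WW]/DD.
    return f"/{prefix}/{dd}" if two_level else f"/{prefix}{dd}"


# Usage
d = date(2016, 2, 19)
print(partition_dir(d, 0), partition_dir(d, 0, two_level=True))
print(partition_dir(d, 1), partition_dir(d, 1, two_level=True))
```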
FIG. 5 is a schematic structural diagram of a file transfer apparatus provided by an embodiment of the present invention. The file transfer apparatus provided by this embodiment is applicable to transferring files to and from the HDFS of a Hadoop system. The apparatus is implemented in hardware and software and may be integrated in the processor of the terminal device for the processor to invoke. As shown in FIG. 5, the file transfer apparatus of this embodiment includes a configuration module 11 and a transfer module 12.
The configuration module 11 is configured to generate a configuration file according to user input. The configuration file includes a file transfer path and a transfer rule corresponding to the file transfer path, where the file transfer path includes a source address and a target address of the files to be transferred.
It should be noted that, in the embodiments of the present invention, a file transfer may upload files from the memory or a relational database of the terminal device to HDFS, may download files from HDFS to the memory or a relational database of the terminal device, or may combine both directions. That is, the embodiments do not limit the number of file transfer paths, nor do they limit which network end the source address and target address of each file transfer path belong to.
The transmission module 12 is configured to, according to the configuration file generated by the configuration module 11, search the source address of the file transfer path for files that satisfy the transfer rule and transfer the files found to the target address. The transferred files include formatted data files and/or unformatted data files.
The file transfer apparatus provided by this embodiment is used to perform the file transfer method provided by the embodiment shown in FIG. 1 and has the corresponding functional modules. Its implementation principle and technical effects are similar and are not repeated here.
Optionally, FIG. 6 is a schematic structural diagram of another file transfer apparatus according to an embodiment of the present invention. In this embodiment the transfer rule corresponding to the file transfer path is, for example, a file format. On the basis of the embodiment shown in FIG. 5, the transmission module 12 in this embodiment includes: a search unit 13 configured to search the source address of the file transfer path for files matching the file format and to generate a file transfer list; and a transmission unit 14 configured to transfer, in order, the corresponding files from the source address to the target address according to the file transfer list generated by the search unit 13.
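The division of work between the search unit and the transmission unit can be sketched as two functions: one builds the file transfer list, the other walks it in order. This is a minimal sketch under stated assumptions: the file format is treated as a glob pattern, and both addresses are local directories copied with `shutil`, standing in for a real HDFS client. The function names are hypothetical.

```python
import fnmatch
import os
import shutil
from typing import List


def build_transfer_list(source_dir: str, file_format: str) -> List[str]:
    """Search step: names in source_dir that match the file-format pattern."""
    return sorted(
        name
        for name in os.listdir(source_dir)
        if os.path.isfile(os.path.join(source_dir, name))
        and fnmatch.fnmatch(name, file_format)
    )


def transfer_in_order(source_dir: str, target_dir: str,
                      transfer_list: List[str]) -> List[str]:
    """Transfer step: copy each listed file in turn; return what was sent."""
    os.makedirs(target_dir, exist_ok=True)
    sent = []
    for name in transfer_list:
        shutil.copy2(os.path.join(source_dir, name),
                     os.path.join(target_dir, name))
        sent.append(name)
    return sent
```

Keeping the list as an explicit intermediate result means the same transfer step can be reused when the list comes from a different lookup method.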
In a specific implementation, there may be one or more file transfer paths. When the configuration file includes multiple file transfer paths, the files to be transferred on each path may be looked up at the same time or in separate passes. Optionally, the file transfer paths include, for example, a first path and a second path, each of which includes at least one file transfer path. In that case the search unit 13 is further configured to search, in turn, the source addresses of the first path and the second path for files matching the file format that corresponds to the current file transfer path, and to generate a current file transfer list; the transmission unit 14 is further configured to transfer, in order, the corresponding files from the source address of the current file transfer path to the target address according to the current file transfer list generated by the search unit 13.
The file transfer apparatus provided by this embodiment is used to perform the file transfer method provided by the embodiment shown in FIG. 2 and has the corresponding functional modules. Its implementation principle and technical effects are similar and are not repeated here.
In another possible implementation of this embodiment, the configuration file further includes a verification mode, and the transfer rule corresponding to the file transfer path may also be an index verification file. On the basis of the file transfer apparatus shown in FIG. 6, the transmission module 12 then further includes a determining unit 15 configured to determine the state of the verification mode. Correspondingly, the search unit 13 is further configured to, when the determining unit 15 determines that the verification mode is verification on, search the source address of the file transfer path for the files corresponding to the index entries of the index verification file; the transmission unit 14 is further configured to transfer the files found by the search unit 13 to the target address. In addition, if the transfer rule in this embodiment also includes a file format, the search unit 13 is further configured to, when the determining unit 15 determines that the verification mode is verification off, search the source address of the file transfer path for files matching the file format and generate a file transfer list; the transmission unit 14 is further configured to transfer, in order, the corresponding files from the source address to the target address according to the file transfer list generated by the search unit 13.
The state of the verification mode in this embodiment also includes file processing on or off, which governs further processing after the transmission unit 14 has transferred the files. Optionally, the determining unit 15 is further configured to determine the state of the verification mode after the transmission unit 14 has transferred the files to the target address. Correspondingly, the file transfer apparatus further includes a processing unit 16 configured to, when the determining unit 15 determines that the verification mode is file processing on, store the index verification file in a preset path on the network end to which the target address belongs; the processing unit 16 is further configured to, when the determining unit 15 determines that the verification mode is file processing off, delete the index verification file or store it in a preset path on the network end of the source address.
It should be noted that looking up the files to be transferred through the index verification file, as in this embodiment, and through the file format, as in the foregoing embodiment, are two optional lookup methods provided by the embodiments of the present invention, and normally one of them is chosen. In another possible implementation of this embodiment, if the transfer rule contains no file format and the determining unit 15 determines that the verification mode is verification off, the transmission unit 14 is further configured to transfer all files in the source address of the file transfer path to the target address.
The file transfer apparatus provided by this embodiment is used to perform the file transfer method provided by the embodiment shown in FIG. 3 and has the corresponding functional modules. Its implementation principle and technical effects are similar and are not repeated here.
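The three lookup cases described above (verification on, verification off with a file format, and verification off without one) can be summarised in a single selection function. This is a sketch only: `select_files` and its parameters are hypothetical names, the index entries are assumed to be plain file names, and the file format is assumed to be a glob pattern.

```python
import fnmatch
from typing import Iterable, List, Optional


def select_files(available: Iterable[str],
                 verification_on: bool,
                 index_entries: Optional[Iterable[str]] = None,
                 file_format: Optional[str] = None) -> List[str]:
    """Pick the files to transfer from the names present at the source."""
    names = sorted(available)
    if verification_on:
        # Verification on: only files named by the index verification file.
        wanted = set(index_entries or [])
        return [name for name in names if name in wanted]
    if file_format is not None:
        # Verification off, file format given: pattern match.
        return [name for name in names if fnmatch.fnmatchcase(name, file_format)]
    # Verification off, no file format: everything at the source address.
    return names
```

An index entry that names a file not present at the source is simply skipped in this sketch; how the apparatus treats that case is not stated in the text.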
Optionally, on the basis of the foregoing embodiments, the configuration file may further include a compression mode. Taking the structure of the file transfer apparatus shown in FIG. 6 as an example, the determining unit 15 is further configured to determine the state of the compression mode. Correspondingly, the transmission module 12 in this embodiment further includes a compression unit 17 configured to compress the files found when the determining unit 15 determines that the compression mode is on; the transmission module 12 is further configured to transfer the files compressed by the compression unit 17 to the target address. In addition, the transmission module 12 is further configured to transfer the files found to the target address, without compression, when the determining unit 15 determines that the compression mode is off.
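The compression branch amounts to a single switch in front of the transfer step. The sketch below assumes gzip as the compression algorithm, which the text does not specify, and the function name `prepare_payload` is hypothetical.

```python
import gzip


def prepare_payload(data: bytes, compression_on: bool) -> bytes:
    """Return the bytes to send: gzip-compressed when the mode is on."""
    if compression_on:
        return gzip.compress(data)  # gzip is an assumption for this sketch
    return data
```

The receiving side, or a later job reading from the target address, would need to know which mode was used, for example from a file-name suffix; that convention is outside what the text describes.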
Optionally, the configuration file in this embodiment may further include a backup mode and a backup address. The determining unit 15 in the file transfer apparatus shown in FIG. 6 is then further configured to determine the state of the backup mode after the transmission module 12 has transferred the files found to the target address. Correspondingly, the processing unit 16 is further configured to, when the determining unit 15 determines that the backup mode is off, delete from the source address of the file transfer path the files that have been transferred to the target address; the processing unit 16 is further configured to, when the determining unit 15 determines that the backup mode is on, back up those files to the backup address on the network end to which the source address belongs.
The file transfer apparatus provided by this embodiment is used to perform the file transfer method provided by the embodiment shown in FIG. 4 and has the corresponding functional modules. Its implementation principle and technical effects are similar and are not repeated here.
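The post-transfer handling of a source file can be sketched as below. Two points are assumptions of this sketch rather than statements from the text: the backup is done by moving the file, so the source address is cleared in both modes, and both the source and backup addresses are local directories. The name `finish_source_file` is hypothetical.

```python
import os
import shutil


def finish_source_file(source_path: str, backup_on: bool,
                       backup_dir: str = "") -> None:
    """Handle a source file after it has reached the target address."""
    if backup_on:
        # Backup mode on: keep a copy under the backup address.
        os.makedirs(backup_dir, exist_ok=True)
        shutil.move(source_path,
                    os.path.join(backup_dir, os.path.basename(source_path)))
    else:
        # Backup mode off: the transferred file is deleted from the source.
        os.remove(source_path)
```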
In a specific implementation, on the basis of the foregoing embodiments, if the files to be transferred form a dynamic data stream, for example when the content of a voice call or video conference is transferred to HDFS, the file transfer operation has to be repeated, and the current configuration file can be changed through user settings. The configuration file in this embodiment therefore further includes a scan interval. On the basis of the structure of the file transfer apparatus shown in FIG. 6, the apparatus provided by this embodiment further includes a timing module 18 configured to start timing after the transmission module 12 has transferred the files found to the target address. Correspondingly, the transmission module 12 is further configured to, when the time measured by the timing module 18 reaches the scan interval, search the source address of the file transfer path of the current configuration file for files that satisfy the transfer rule of the current configuration file, and transfer the files found to the target address of that file transfer path.
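The repeated scan can be sketched as a loop that reloads the configuration at the start of every cycle, so that user changes take effect on the next scan. The names `run_cycles`, `load_config`, `transfer_once` and the key `scan_interval_seconds` are hypothetical, and a fixed number of cycles is used here only so that the example terminates; a long-running service would loop until stopped.

```python
import time
from typing import Callable, Dict


def run_cycles(load_config: Callable[[], Dict],
               transfer_once: Callable[[Dict], None],
               cycles: int,
               sleep: Callable[[float], None] = time.sleep) -> None:
    """Run the lookup-and-transfer step repeatedly, one scan interval apart."""
    for cycle in range(cycles):
        config = load_config()      # re-read: the *current* configuration file
        transfer_once(config)       # lookup + transfer for this cycle
        if cycle < cycles - 1:
            # Timing starts once the transfer has finished.
            sleep(config["scan_interval_seconds"])
```

Injecting `sleep` as a parameter keeps the loop testable without real waiting.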
A person of ordinary skill in the art will understand that all or some of the steps of the above method may be carried out by a program instructing the relevant hardware (for example, a processor), and that the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk or an optical disc. Optionally, all or some of the steps of the above embodiments may also be implemented using one or more integrated circuits. Correspondingly, each module or unit in the above embodiments may be implemented in hardware, for example with an integrated circuit that performs its function, or as a software functional module, for example with a processor executing a program or instructions stored in a memory. The present invention is not limited to any particular combination of hardware and software.
Although the embodiments disclosed by the present invention are as described above, they are described only to make the invention easier to understand and are not intended to limit it. Any person skilled in the art to which the invention belongs may make modifications and changes to the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention shall still be that defined by the appended claims.
Industrial applicability
The above technical solution makes it possible to transfer massive numbers of files to and from a Hadoop system in batches using only a configuration file set by the user, and the transferred files may be formatted data files and/or unformatted data files. This solves the problem that existing ways of transferring files to and from a Hadoop system are complicated to operate and inefficient.

Claims (17)

  1. A file transfer method, comprising:
    generating a configuration file according to user input, wherein the configuration file includes a file transfer path and a transfer rule corresponding to the file transfer path, and the file transfer path includes a source address and a target address of files to be transferred;
    searching, according to the configuration file, the source address of the file transfer path for files that satisfy the transfer rule, and transferring the files found to the target address, wherein the transferred files include formatted data files and/or unformatted data files.
  2. The file transfer method according to claim 1, wherein the transfer rule includes a file format, and the searching the source address of the file transfer path for files that satisfy the transfer rule and transferring the files found to the target address comprises:
    searching the source address of the file transfer path for files matching the file format, and generating a file transfer list;
    transferring, in order, the corresponding files from the source address to the target address according to the file transfer list.
  3. The file transfer method according to claim 2, wherein
    the file transfer path includes a first path and a second path, and the first path and the second path each include at least one file transfer path,
    the searching the source address of the file transfer path for files matching the file format, generating a file transfer list, and transferring, in order, the corresponding files from the source address to the target address according to the file transfer list comprises:
    searching, in turn, the source addresses of the first path and the second path for files matching the file format corresponding to the current file transfer path, generating a current file transfer list, and transferring, in order, the corresponding files from the source address of the current file transfer path to the target address according to the current file transfer list.
  4. The file transfer method according to claim 1, wherein the configuration file further includes a verification mode, and the transfer rule includes an index verification file,
    the searching the source address of the file transfer path for files that satisfy the transfer rule and transferring the files found to the target address comprises:
    when the verification mode is verification on, searching the source address of the file transfer path for files corresponding to the index entries of the index verification file, and transferring the files found to the target address;
    the method further comprises: after the files are transferred to the target address, when the verification mode is file processing on, storing the index verification file in a preset path on the network end to which the target address belongs; and when the verification mode is file processing off, deleting the index verification file or storing the index verification file in a preset path on the network end of the source address.
  5. The file transfer method according to claim 4, wherein the transfer rule further includes a file format,
    the searching the source address of the file transfer path for files that satisfy the transfer rule and transferring the files found to the target address further comprises:
    when the verification mode is verification off, searching the source address of the file transfer path for files matching the file format, generating a file transfer list, and transferring, in order, the corresponding files from the source address to the target address according to the file transfer list.
  6. The file transfer method according to any one of claims 1 to 5, wherein the configuration file further includes a compression mode,
    the transferring the files found to the target address comprises:
    when the compression mode is on, compressing the files found and transferring the compressed files to the target address;
    when the compression mode is off, transferring the files found to the target address.
  7. The file transfer method according to any one of claims 1 to 5, wherein the configuration file further includes a backup mode and a backup address,
    the method further comprises:
    after the files found are transferred to the target address, when the backup mode is off, deleting from the source address of the file transfer path the files that have been transferred to the target address;
    when the backup mode is on, backing up the files in the source address of the file transfer path that have been transferred to the target address to the backup address on the network end to which the source address belongs.
  8. The file transfer method according to any one of claims 1 to 5, wherein the configuration file further includes a scan interval,
    the method further comprises:
    after the searching the source address of the file transfer path for files that satisfy the transfer rule and transferring the files found to the target address, when the timed duration reaches the scan interval, searching, according to the current configuration file, the source address of the file transfer path of the current configuration file for files that satisfy the transfer rule of the current configuration file, and transferring the files found to the target address of the file transfer path of the current configuration file.
  9. A file transfer apparatus, comprising:
    a configuration module configured to generate a configuration file according to user input, wherein the configuration file includes a file transfer path and a transfer rule corresponding to the file transfer path, and the file transfer path includes a source address and a target address of files to be transferred;
    a transmission module configured to search, according to the configuration file generated by the configuration module, the source address of the file transfer path for files that satisfy the transfer rule, and to transfer the files found to the target address, wherein the transferred files include formatted data files and/or unformatted data files.
  10. The file transfer apparatus according to claim 9, wherein the transfer rule includes a file format,
    the transmission module comprises:
    a search unit configured to search the source address of the file transfer path for files matching the file format, and to generate a file transfer list;
    a transmission unit configured to transfer, in order, the corresponding files from the source address to the target address according to the file transfer list generated by the search unit.
  11. The file transfer apparatus according to claim 10, wherein the file transfer path includes a first path and a second path, and the first path and the second path each include at least one file transfer path,
    the search unit is configured to search, in turn, the source addresses of the first path and the second path for files matching the file format corresponding to the current file transfer path, and to generate a current file transfer list;
    the transmission unit is configured to transfer, in order, the corresponding files from the source address of the current file transfer path to the target address according to the current file transfer list generated by the search unit.
  12. The file transfer apparatus according to claim 9, wherein the configuration file further includes a verification mode, and the transfer rule includes an index verification file,
    the transmission module comprises:
    a determining unit configured to determine the state of the verification mode;
    a search unit configured to, when the determining unit determines that the verification mode is verification on, search the source address of the file transfer path for files corresponding to the index entries of the index verification file;
    a transmission unit configured to transfer the files found by the search unit to the target address;
    the file transfer module further comprises:
    a processing unit configured to, after the files are transferred to the target address, store the index verification file in a preset path on the network end to which the target address belongs when the determining unit determines that the verification mode is file processing on; and, when the determining unit determines that the verification mode is file processing off, delete the index verification file or store the index verification file in a preset path on the network end of the source address.
  13. The file transfer apparatus according to claim 12, wherein the transfer rule further includes a file format,
    the search unit is further configured to, when the determining unit determines that the verification mode is verification off, search the source address of the file transfer path for files matching the file format and generate a file transfer list.
  14. The file transfer apparatus according to claim 9, wherein the configuration file further includes a compression mode,
    the transmission module comprises: a determining unit, a compression unit and a transmission unit;
    the determining unit is configured to determine the state of the compression mode;
    the compression unit is configured to compress the files found when the determining unit determines that the compression mode is on;
    the transmission unit is configured to transfer the files compressed by the compression unit to the target address, and, when the determining unit determines that the compression mode is off, to transfer the files found to the target address.
  15. The file transfer apparatus according to claim 10, wherein the configuration file further includes a backup mode and a backup address,
    the transmission module further comprises: a determining unit and a processing unit;
    the determining unit is configured to determine the state of the backup mode after the transmission unit has transferred the files found to the target address;
    the processing unit is configured to, when the determining unit determines that the backup mode is off, delete from the source address of the file transfer path the files that have been transferred to the target address; and, when the determining unit determines that the backup mode is on, back up the files in the source address of the file transfer path that have been transferred to the target address to the backup address on the network end to which the source address belongs.
  16. The file transfer apparatus according to any one of claims 9 to 13, wherein the configuration file further includes a scan interval,
    the file transfer apparatus further comprises:
    a timing module configured to start timing after the transmission module has transferred the files found to the target address;
    the transmission module is further configured to, when the time measured by the timing module reaches the scan interval, search, according to the current configuration file, the source address of the file transfer path of the current configuration file for files that satisfy the transfer rule of the current configuration file, and transfer the files found to the target address of the file transfer path of the current configuration file.
  17. A computer storage medium storing computer-executable instructions, wherein the computer-executable instructions are used to perform the method according to any one of claims 1 to 8.
PCT/CN2016/074278 2015-08-07 2016-02-22 File transfer method and apparatus WO2016165482A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510483928.4A CN106445951B (en) 2015-08-07 2015-08-07 File transmission method and device
CN201510483928.4 2015-08-07

Publications (1)

Publication Number Publication Date
WO2016165482A1 true WO2016165482A1 (en) 2016-10-20

Family

ID=57126960

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/074278 WO2016165482A1 (en) 2015-08-07 2016-02-22 File transfer method and apparatus

Country Status (2)

Country Link
CN (1) CN106445951B (en)
WO (1) WO2016165482A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112035407A (en) * 2020-09-01 2020-12-04 武汉虹旭信息技术有限责任公司 File transmission system and transmission method
CN114827129A (en) * 2022-04-27 2022-07-29 济南浪潮数据技术有限公司 File distribution method, device and medium

Families Citing this family (11)

Publication number Priority date Publication date Assignee Title
CN107832352B (en) * 2017-10-23 2022-01-04 中国银行股份有限公司 Log automatic processing method, device, storage medium and equipment
CN108268346B (en) * 2018-02-13 2021-03-30 苏州龙信信息科技有限公司 Data backup method, device, equipment and storage medium
CN109144946A (en) * 2018-07-24 2019-01-04 中国建设银行股份有限公司 A kind of document handling method and device
CN109697189A (en) * 2018-12-14 2019-04-30 北京锐安科技有限公司 A kind of processing method of data file, device, storage medium and electronic equipment
CN109889588B (en) * 2019-02-13 2021-10-29 中国银行股份有限公司 File acquisition method and device, computer equipment and storage medium
CN110830567A (en) * 2019-10-31 2020-02-21 中国银行股份有限公司 Data transmission method and device
CN111367857B (en) * 2020-03-03 2023-06-16 中国联合网络通信集团有限公司 Data storage method and device, FTP server and storage medium
CN111625449A (en) * 2020-05-08 2020-09-04 苏州浪潮智能科技有限公司 File filtering rule testing method, device, equipment and readable storage medium
CN111586187A (en) * 2020-05-12 2020-08-25 甬矽电子(宁波)股份有限公司 Data transmission method, device, application server and data transmission system
CN112632023A (en) * 2020-12-30 2021-04-09 北京浩瀚深度信息技术股份有限公司 File transmission method and device written by multiple data sources and storage medium
CN112765605A (en) * 2020-12-31 2021-05-07 浙江中控技术股份有限公司 Data processing method and related equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101669329A (en) * 2007-04-30 2010-03-10 惠普发展公司,有限责任合伙企业 Autofile shifts
CN102222112A (en) * 2011-07-12 2011-10-19 宇龙计算机通信科技(深圳)有限公司 Resource management device and resource management method
CN103002032A (en) * 2012-12-03 2013-03-27 北京奇虎科技有限公司 Bidirectional data transmission method and device
CN103677673A (en) * 2013-12-23 2014-03-26 Tcl集团股份有限公司 Method and system for uploading files in classifying and batching mode
CN103685413A (en) * 2012-09-19 2014-03-26 腾讯科技(深圳)有限公司 File uploading method and system

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2864397A1 (en) * 2003-12-23 2005-06-24 France Telecom Digital candidate file transmission temporal management method, involves finding mailing date by taking into account expiration date, and sending file using successive putbacks until file has been completely sent before expiration date
CN100471173C (en) * 2006-03-11 2009-03-18 华为技术有限公司 Data report transmitting system and method
CN101335763B (en) * 2007-06-28 2013-05-29 中兴通讯股份有限公司 Data educing and transmitting method in intelligent network system
CN101841463B (en) * 2010-03-05 2012-05-16 清华大学 Multipath concurrent transmission method based on SCTP (Stream Control Transmission Protocol)
CN102419770B (en) * 2011-11-23 2014-12-31 中兴通讯股份有限公司 File sharing system, method for realizing file sharing, and file index service equipment
US10367878B2 (en) * 2012-03-31 2019-07-30 Bmc Software, Inc. Optimization of path selection for transfers of files
CN102984244A (en) * 2012-11-21 2013-03-20 用友软件股份有限公司 Uploading system and uploading method of bill data
CN103873517B (en) * 2012-12-14 2017-07-14 中兴通讯股份有限公司 Method, device and system for data synchronization
CN104424225B (en) * 2013-08-26 2018-08-31 联想(北京)有限公司 Document handling method based on document transmission process and device
CN104092719B (en) * 2013-12-17 2015-10-07 深圳市腾讯计算机系统有限公司 Document transmission method, device and distributed cluster file system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101669329A (en) * 2007-04-30 2010-03-10 惠普发展公司,有限责任合伙企业 Autofile shifts
CN102222112A (en) * 2011-07-12 2011-10-19 宇龙计算机通信科技(深圳)有限公司 Resource management device and resource management method
CN103685413A (en) * 2012-09-19 2014-03-26 腾讯科技(深圳)有限公司 File uploading method and system
CN103002032A (en) * 2012-12-03 2013-03-27 北京奇虎科技有限公司 Bidirectional data transmission method and device
CN103677673A (en) * 2013-12-23 2014-03-26 Tcl集团股份有限公司 Method and system for uploading files in classifying and batching mode

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112035407A (en) * 2020-09-01 2020-12-04 武汉虹旭信息技术有限责任公司 File transmission system and transmission method
CN112035407B (en) * 2020-09-01 2023-10-31 武汉虹旭信息技术有限责任公司 File transmission system and transmission method
CN114827129A (en) * 2022-04-27 2022-07-29 济南浪潮数据技术有限公司 File distribution method, device and medium

Also Published As

Publication number Publication date
CN106445951A (en) 2017-02-22
CN106445951B (en) 2022-05-17

Similar Documents

Publication Publication Date Title
WO2016165482A1 (en) File transfer method and apparatus
US8706703B2 (en) Efficient file system object-based deduplication
US9996549B2 (en) Method to construct a file system based on aggregated metadata from disparate sources
US20150058289A1 (en) Facilitating data migration between database clusters while the database continues operating
CN106506587A (en) Docker image download method based on distributed storage
US11640341B1 (en) Data recovery in a multi-pipeline data forwarder
US9900386B2 (en) Provisioning data to distributed computing systems
US11991094B2 (en) Metadata driven static determination of controller availability
US11714783B2 (en) Browsable data and data retrieval from a data archived image
US11182378B2 (en) System and method for committing and rolling back database requests
TW201738781A (en) Method and device for joining tables
US12099886B2 (en) Techniques for performing clipboard-to-file paste operations
US11307984B2 (en) Optimized sorting of variable-length records
US11157456B2 (en) Replication of data in a distributed file system using an arbiter
CN110168513A (en) Storing parts of a large file in different storage systems
US20150074351A1 (en) Write-behind caching in distributed file systems
US20220036206A1 (en) Containerized distributed rules engine
WO2024021808A1 (en) Data query request processing method and apparatus, device and storage medium
US20150154219A1 (en) Computing resource provisioning based on deduplication
CA2722511C (en) Efficient change tracking of transcoded copies
US11379147B2 (en) Method, device, and computer program product for managing storage system
US20140115370A1 (en) Electronic device and method for reducing energy consumption of storage devices
CN116490865A (en) Efficient bulk loading of multiple rows or partitions for a single target table
CN115994140A (en) Distributed system, data processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 16779457

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 16779457

Country of ref document: EP

Kind code of ref document: A1