CN114640665A - Multi-source segmented parallel file downloading method and tool - Google Patents

Multi-source segmented parallel file downloading method and tool

Publication number
CN114640665A
Authority
CN
China
Prior art keywords
downloading
download
speed
queue
link
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210137400.1A
Other languages
Chinese (zh)
Other versions
CN114640665B (en)
Inventor
孙加铱
郭文明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202210137400.1A priority Critical patent/CN114640665B/en
Publication of CN114640665A publication Critical patent/CN114640665A/en
Application granted granted Critical
Publication of CN114640665B publication Critical patent/CN114640665B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1074 Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
    • H04L67/1078 Resource delivery mechanisms
    • H04L67/108 Resource delivery mechanisms characterised by resources being split in blocks or fragments
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1074 Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
    • H04L67/1078 Resource delivery mechanisms
    • H04L67/1085 Resource delivery mechanisms involving dynamic management of active down- or uploading connections

Abstract

An embodiment of the invention provides a multi-source segmented parallel file downloading method and tool. The method comprises: step one: performing MD5 verification on the acquired download links, and grouping download links with the same MD5 value into the same download queue; step two: obtaining the queue download speed of each download queue, and selecting the download queue with the highest speed for segmented parallel downloading of the file; step three: dividing the file of the download queue into data blocks of different sizes in proportion to the speed of each download link in the queue, and allocating threads to the current download queue according to the number of data blocks, each thread being responsible for its corresponding data block and download link, with the threads downloading in parallel; step four: after downloading is complete, assembling the data blocks and merging them into a single file. The invention makes full use of the network bandwidth, effectively reduces download time, ensures reliable download transmission, improves the download success rate, and safeguards the user's download experience.

Description

Multi-source segmented parallel file downloading method and tool
Technical Field
The invention relates to the technical field of data downloading of the Internet, in particular to a multi-source segmented parallel downloading method and tool.
Background
With the widespread adoption of personal computers and the trend toward intelligent offices, the demand for downloading and transferring files is steadily increasing. In this environment, Internet services such as online and offline file downloading, cloud disks, and cloud storage have developed rapidly, making file download and transfer far more convenient. As transferred files grow larger and time becomes more costly, users demand ever higher download speeds. However, most websites currently limit file download speed and cannot guarantee transmission stability; on some websites the download speed often drops sharply partway through a download, there is no fault tolerance, and the time cost of downloading rises considerably. Even with paid cloud disk products, the download speed usually falls short of the available network bandwidth, and the paid download service is sometimes unsatisfactory. For some office workers and students, the files to be downloaded may be available in both FTP and HTTP environments, yet the two cannot be used at the same time to exploit the existing bandwidth. Existing download technology can only download a file from a single download link, so the download rate is limited by that source; with the spread of 5G, growing network bandwidth, and a much faster pace of life, these shortcomings are becoming apparent. Meanwhile, the same file may have multiple download sources on the Internet, and these sources differ in quality, so choosing a download source becomes a difficult problem.
Disclosure of Invention
In view of this, embodiments of the present invention provide a multi-source segmented parallel file downloading method and tool, which can effectively improve downloading speed and downloading success rate.
The invention discloses a multi-source segmented parallel file downloading method, which comprises the following steps:
step one: performing MD5 verification on the acquired download links, and grouping download links with the same MD5 value into the same download queue;
step two: obtaining the queue download speed of each download queue, and selecting the download queue with the highest speed for segmented parallel downloading of the file;
step three: dividing the file of the download queue into data blocks of different sizes in proportion to the speed of each download link in the queue, and allocating threads to the current download queue according to the number of data blocks, each thread being responsible for its corresponding data block and download link, with the threads downloading in parallel;
step four: after downloading is complete, assembling the data blocks and merging them into a single file.
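Step one can be illustrated with a minimal Python sketch. The text does not say how the MD5 value behind each link is obtained, so the sketch assumes it is supplied by the caller; the function name `build_download_queues` and the example URLs are hypothetical.

```python
from collections import defaultdict


def build_download_queues(link_md5_pairs):
    """Group download links into queues keyed by MD5 value.

    link_md5_pairs: iterable of (url, md5_hex_or_None). Obtaining the MD5
    value is outside this sketch (assumption: the caller provides it).
    """
    queues = defaultdict(list)
    for url, md5 in link_md5_pairs:
        if not md5:
            # MD5 verification failed: the link is dropped.
            continue
        # Hex digests are compared case-insensitively.
        queues[md5.lower()].append(url)
    return dict(queues)
```

Links whose MD5 values match end up in the same queue in the order they were supplied, so each queue holds alternative sources for one and the same file.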
Further, before step one, the method further comprises: acquiring download links by keyword matching and/or manual acquisition;
wherein acquiring download links by keyword matching comprises:
collecting keywords of the target file to be downloaded;
and collecting download links from an existing download resource pool according to the keywords.
Further, before step one, the method further comprises:
presetting the download environment, the protocols, and the highest and lowest download speeds under each protocol;
step one further comprises:
carrying out validity test on the download link of each download queue to obtain the validity of each download link;
predicting the future downloading speed of the downloading link of each downloading queue through a downloading speed prediction model to obtain the future average downloading speed of each downloading link;
labeling a protocol source for the download link in each download queue, wherein the protocol source is used for reflecting a protocol corresponding to the download link;
step two comprises:
accumulating the future average download speeds of the download links with the same protocol source in each download queue to obtain the accumulated download speed corresponding to each protocol;
judging whether the accumulated download speed exceeds the highest download speed of the corresponding protocol; if so, taking the highest download speed of the corresponding protocol as the actual download speed, and if not, taking the accumulated download speed as the actual download speed;
calculating the sum of the actual download speeds to determine the fastest download speed;
comparing the fastest download speed with the network bandwidth and the disk read-write speed of the download environment: if the fastest download speed is less than both the network bandwidth and the disk read-write speed, taking the fastest download speed as the queue download speed; if the fastest download speed is greater than only one of the network bandwidth and the disk read-write speed, taking that value as the queue download speed; if the fastest download speed is greater than both the network bandwidth and the disk read-write speed, taking the lower of the two as the queue download speed.
Further, step two further comprises: calculating the sum of the accumulated download speeds to obtain the ideal fastest download speed;
step three comprises:
allocating data blocks to the download links in the order in which the links were added, wherein the share of the file allocated to each download link is: (download speed of the download link / ideal fastest download speed) × 100%;
if only one download queue is used for downloading, allocating all available download threads to that download queue; if a plurality of download queues are used for downloading, dividing the threads equally among the download queues by default, wherein the default setting can be modified to allocate any number of threads to each download queue, provided that each download queue is allocated at least one thread;
for the data blocks in any download queue, if the number of data blocks is less than or equal to the number of threads of the download queue, allocating one thread to each data block for downloading, with only one download link working in each thread; if the number of data blocks is greater than the number of threads, allocating the threads, together with the corresponding data blocks, to the download links in descending order of download speed, adding the download links that are not allocated a thread to a standby queue, and assigning all unallocated data blocks to the thread of the download link with the highest download speed;
dynamically reallocating data blocks among the threads during downloading: after a thread finishes downloading its data block, removing its download link from the thread, calling the download speed prediction model again to predict the future average download speed of that download link and of all download links in the standby queue, selecting the download link with the highest download speed to rejoin the thread, calculating the remaining download time of the other threads, selecting the thread with the longest remaining download time, and dividing that thread's remaining data block in proportion to download speed.
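The proportional allocation of data blocks in step three might look like the following minimal Python sketch. It assumes the file size is known in bytes and that each block is a contiguous byte range; the function name `split_by_speed` and the tuple layout are hypothetical, and the sum of the link speeds stands in for the ideal fastest download speed.

```python
def split_by_speed(file_size, link_speeds):
    """Split [0, file_size) into byte ranges proportional to link speed.

    link_speeds: list of (url, predicted_speed) in the order links were added.
    Returns a list of (url, start, end) with `end` exclusive.
    """
    total = sum(speed for _, speed in link_speeds)  # ideal fastest speed
    if file_size <= 0 or total <= 0:
        raise ValueError("file size and total speed must be positive")

    ranges = []
    start = 0
    cumulative = 0.0
    last = len(link_speeds) - 1
    for i, (url, speed) in enumerate(link_speeds):
        cumulative += speed
        # The last link takes the remainder so rounding never loses bytes.
        end = file_size if i == last else int(file_size * cumulative / total)
        ranges.append((url, start, end))
        start = end
    return ranges
```

For a 1000-byte file and speeds 3, 1, and 1, the three links receive the ranges 0-600, 600-800, and 800-1000. Such ranges could then be requested with, for example, HTTP range requests, although the text does not name a specific mechanism.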
Further, after step three, the method further comprises: monitoring exceptions during downloading, prompting the user, and handling the exceptions; wherein the exceptions comprise one or more of: unknown error, timeout of a single download link, timeout of all download links, file I/O error, resource lost or not found, network connection failure, download speed too slow, and insufficient available disk space;
step four further comprises:
after the file download is complete, saving the index file of the downloaded file and the download queue.
The invention also discloses a multi-source segmented parallel file downloading tool, which comprises:
a download distribution module to:
performing MD5 verification on the obtained download links, and classifying the download links with the same MD5 value into the same download queue;
acquiring the queue downloading speed of each downloading queue, and selecting the downloading queue with the highest speed to perform segmented parallel downloading of the files;
dividing the file of the download queue into data blocks with different sizes according to the speed of each download link in the download queue in proportion, and distributing threads to the current download queue according to the number of the data blocks, wherein each thread is used for being responsible for the corresponding data block and the download link and carrying out multi-thread parallel download;
the download monitoring module is used for downloading files;
and the download completion module is used for assembling the data blocks after the download is completed and combining the data blocks into a unified file.
Further, the file download tool further comprises: a BitTorrent file downloading module, a keyword downloading module and/or a manual acquisition module, wherein the keyword downloading module is used for:
collecting keywords of the file to be downloaded; collecting download links from an existing download resource pool according to the keywords;
the manual acquisition module is used for:
collecting one or more download link addresses entered by the user, or reading a txt file uploaded by the user, parsing the txt file to extract recognizable download links from it, and passing the collected download links to the download distribution module for assembling download queues;
the BitTorrent file downloading module is used for collecting a remote BitTorrent file address entered by the user or a local BitTorrent file for file downloading, or collecting a BitTorrent Magnet URI entered by the user for magnet link downloading.
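The link-extraction step of the manual acquisition module could be sketched in Python as below. The text does not define what counts as a recognizable download link, so the pattern here (HTTP, HTTPS, FTP, and magnet URIs) and the function name `extract_links` are assumptions for illustration only.

```python
import re

# Assumption: a "recognizable" link starts with one of these schemes.
LINK_PATTERN = re.compile(r"(?:https?://|ftp://|magnet:\?)[^\s<>]+", re.IGNORECASE)


def extract_links(text):
    """Return the distinct links found in `text`, in first-seen order."""
    seen = set()
    links = []
    for match in LINK_PATTERN.findall(text):
        # Trim punctuation that commonly trails a link in running text.
        link = match.rstrip(".,;:!?)\"'")
        if link and link not in seen:
            seen.add(link)
            links.append(link)
    return links
```

The contents of an uploaded txt file would be read into a string and passed to `extract_links`; the resulting list is what would be handed on for MD5 verification and queue assembly.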
Further, the file download tool further comprises: the download setting module is used for acquiring a download environment set by a user, wherein the download environment comprises: the download file storage position, the disk read-write speed of the download file storage position, the residual disk space of the download file storage position, HTTP environment download parameters, FTP environment download parameters, network bandwidth parameters and the maximum CPU thread number allocated to the download task;
the download distribution module is further configured to:
carrying out validity test on the download link of each download queue to obtain the validity of each download link;
predicting the future downloading speed of the downloading link of each downloading queue through a downloading speed prediction model to obtain the future average downloading speed of each downloading link, and taking the predicted future average downloading speed as the actual downloading speed of the downloading link;
labeling a protocol source for the download link in each download queue, wherein the protocol source is used for reflecting a protocol corresponding to the download link;
the "acquiring the queue downloading speed of each downloading queue" specifically includes:
accumulating the future average download speeds of the download links with the same protocol source in each download queue to obtain the accumulated download speed corresponding to each protocol;
judging whether the accumulated downloading speed exceeds the highest downloading speed of the corresponding protocol, if so, taking the highest downloading speed of the corresponding protocol as the actual downloading speed, and if not, taking the accumulated downloading speed as the actual downloading speed;
calculating the sum of actual downloading speeds, and determining the fastest downloading speed;
comparing the fastest download speed with the network bandwidth and the disk read-write speed of the download environment: if the fastest download speed is less than both the network bandwidth and the disk read-write speed, taking the fastest download speed as the queue download speed; if the fastest download speed is greater than only one of the network bandwidth and the disk read-write speed, taking that value as the queue download speed; and if the fastest download speed is greater than both the network bandwidth and the disk read-write speed, taking the lower of the two as the queue download speed.
Further, the download allocation module is specifically configured to:
allocate data blocks to the download links in the order in which the links were added, wherein the share of the file allocated to each download link is: (download speed of the download link / ideal fastest download speed) × 100%;
if only one download queue is used for downloading, allocate all available download threads to that download queue; if a plurality of download queues are used for downloading, divide the threads equally among the download queues by default, wherein the default setting can be modified to allocate any number of threads to each download queue, provided that each download queue is allocated at least one thread;
for the data blocks in any download queue, if the number of data blocks is less than or equal to the number of threads of the download queue, allocate one thread to each data block for downloading, with only one download link working in each thread; if the number of data blocks is greater than the number of threads, allocate the threads, together with the corresponding data blocks, to the download links in descending order of download speed, add the download links that are not allocated a thread to a standby queue, and assign all unallocated data blocks to the thread of the download link with the highest download speed;
dynamically reallocate data blocks among the threads during downloading: after a thread finishes downloading its data block, remove its download link from the thread, call the download speed prediction model again to predict the future average download speed of that download link and of all download links in the standby queue, select the download link with the highest download speed to rejoin the thread, calculate the remaining download time of the other threads, select the thread with the longest remaining download time, and divide that thread's remaining data block in proportion to download speed.
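The initial thread allocation described above, including the standby queue, can be sketched in Python as follows. This is a simplified illustration: blocks are represented only by their size in bytes rather than by byte ranges, and the function name `assign_threads` and the tuple layout are hypothetical.

```python
def assign_threads(blocks, thread_count):
    """Allocate threads to download links, fastest links first.

    blocks: list of (url, speed, block_size), one block per download link.
    Returns (assignments, standby): `assignments` is a list of
    (url, bytes_to_download), one entry per thread in use, and `standby`
    lists the links that did not receive a thread.
    """
    if thread_count < 1:
        raise ValueError("each download queue needs at least one thread")

    # sorted() is stable, so links with equal speed keep their added order.
    ranked = sorted(blocks, key=lambda b: b[1], reverse=True)
    active, spare = ranked[:thread_count], ranked[thread_count:]

    assignments = [[url, size] for url, _, size in active]
    if spare:
        # Blocks left without a thread all go to the fastest link's thread.
        assignments[0][1] += sum(size for _, _, size in spare)
    standby = [url for url, _, _ in spare]
    return [tuple(a) for a in assignments], standby
```

With three blocks of 500, 300, and 200 bytes on links of speed 5, 3, and 2 and only two threads, the fastest link downloads 700 bytes, the second downloads 300 bytes, and the slowest link waits in the standby queue. The later dynamic reallocation step is not covered by this sketch.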
Further, the download monitoring module is further configured to: download the file, monitor exceptions during downloading, prompt the user, and handle the exceptions; wherein the exceptions comprise one or more of: unknown error, timeout of a single download link, timeout of all download links, file I/O error, resource lost or not found, network connection failure, download speed too slow, and insufficient available disk space;
the download completion module is also used for storing the index file and the download queue of the downloaded file after the file download is completed.
The implementation of the embodiments of the present invention includes the following beneficial results:
An embodiment of the invention provides a multi-source segmented parallel file downloading method in which download queues are assembled according to the MD5 value of the file behind each download link, the file is divided into data blocks, and the blocks are downloaded from multiple sources in multiple threads, so that the network bandwidth is fully used and the download time is effectively reduced; the transmission speeds of the HTTP and FTP protocols are handled separately so that data traffic is effectively monitored; in addition, exceptions are handled intelligently during and after downloading, and data blocks from invalid or slow links are automatically reassigned to other normal links for completion, which ensures reliable download transmission, improves the download success rate, and safeguards the user's download experience.
An embodiment of the invention provides a multi-source segmented parallel file downloading tool which, on the basis of the method, adds a resource pool interface and uses keyword matching to search for download links intelligently from an entered keyword, sparing the user the tedious step of finding download links; a BitTorrent download function is also added, so that the tool supports most current mainstream download methods.
Other advantageous effects of the present invention will be described in the detailed description.
Drawings
Fig. 1 is an overall flowchart of a multi-source segmented parallel file downloading method disclosed in the preferred embodiment of the present invention.
Fig. 2 is a schematic view of a specific flow of a multi-source segmentation parallel file downloading method disclosed in the preferred embodiment of the present invention.
Fig. 3 is a schematic view of a processing flow of a download exception according to the preferred embodiment of the present invention.
FIG. 4 is a block diagram of a multi-source segmented parallel file download tool system according to the preferred embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments. The specific embodiments described herein are merely illustrative of the invention and are not intended to be limiting. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art. The process of the invention has the following examples.
Example one
As shown in fig. 1, the invention discloses a multi-source segmented parallel file downloading method, which comprises the following steps:
step one: performing MD5 verification on the obtained download links, and classifying the download links with the same MD5 value into the same download queue;
step two: acquiring the queue downloading speed of each downloading queue, and selecting the downloading queue with the highest speed to perform segmented parallel downloading of the files;
step three: dividing files of a download queue into data blocks with different sizes according to the speed of each download link in the download queue in proportion, distributing threads to the current download queue according to the number of the data blocks, wherein each thread is responsible for the corresponding data block and the download link, and performing multi-thread parallel downloading;
step four: and after downloading, assembling the data blocks and combining the data blocks into a unified file.
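Step four can be illustrated with a short Python sketch that concatenates downloaded part files in order. Storing each data block as a separate part file is an assumption, as is the function name `assemble_blocks`. Returning the MD5 digest of the assembled file is also an addition of this sketch, so that a caller could compare it with the MD5 value of the download queue; the text itself does not describe such a final check.

```python
import hashlib


def assemble_blocks(part_paths, output_path, chunk_size=1 << 20):
    """Concatenate part files, in the given order, into one output file.

    Returns the hex MD5 digest of the assembled file (an extra convenience,
    not something the described method requires).
    """
    digest = hashlib.md5()
    with open(output_path, "wb") as out:
        for path in part_paths:
            with open(path, "rb") as part:
                # Copy in chunks so large blocks never sit fully in memory.
                while True:
                    chunk = part.read(chunk_size)
                    if not chunk:
                        break
                    out.write(chunk)
                    digest.update(chunk)
    return digest.hexdigest()
```

The order of `part_paths` must follow the byte order of the data blocks in the original file, since the parts are simply appended one after another.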
This embodiment also discloses a multi-source segmented parallel file downloading tool, comprising:
a download allocation module to: performing MD5 verification on the acquired download links, and classifying the download links with the same MD5 value into the same download queue; acquiring the queue downloading speed of each downloading queue, and selecting the downloading queue with the highest speed to perform segmented parallel downloading of the files; dividing the file of the download queue into data blocks with different sizes according to the speed of each download link in the download queue in proportion, and distributing threads to the current download queue according to the number of the data blocks, wherein each thread is used for being responsible for the corresponding data block and the download link and carrying out multi-thread parallel download;
the download monitoring module is used for downloading files;
and the download completion module is used for assembling the data blocks after the download is completed and combining the data blocks into a unified file.
Example two
This embodiment discloses a multi-source segmented parallel file downloading method, which comprises the following steps: collecting keywords of the target file to be downloaded; collecting download links from an existing download resource pool according to the keywords; performing MD5 verification on the collected download links, and grouping download links with the same MD5 value into the same download queue; calculating the final download speed of each download queue from the disk read-write speed, the bandwidth, and the download speed of each download link, and selecting the download queue with the highest final download speed for segmented parallel downloading of the file; dividing the file of the download queue into data blocks of different sizes in proportion to the speed of each download link in the queue, and allocating threads to the current download queue according to the number of data blocks, each thread being responsible for its corresponding data block and download link, with the threads downloading in parallel; and after downloading is complete, assembling the data blocks and merging them into a single file.
Preferably, the method disclosed in this embodiment further includes:
a downloading protocol needs to be preset before downloading;
before downloading, the highest downloading speed and the lowest downloading speed under each downloading protocol need to be preset;
after the download queues are established, the download link of each queue is tested;
after the download queues are established, the download links in each download queue can be deleted at will;
after the download queues are established, a new download link can be added; MD5 verification is performed on the newly added download link, which is automatically added to the download queue with the same MD5 value, and if its MD5 value differs from that of every existing queue, a new download queue is created for it;
when a download queue is selected, if the sizes of the files downloaded by any two download queues differ by more than 10%, the download queue must be selected manually;
when the download queues are selected, if the final download speeds of the plurality of download queues are the same, the first download queue is selected by default, or the download queues can be selected manually;
when the download queue is selected, a plurality of download queues can be manually selected to be downloaded simultaneously;
during the downloading process of the downloading queue, the downloading status of each downloading link in the downloading queue needs to be monitored in real time, and if the downloading status is abnormal, an error prompt is reported and processed.
During and after the downloading process, the pre-generated downloading queue will not be emptied and will be stored in the history for standby or repeated downloading.
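The queue selection rules in the list above (the 10% size difference and the tie-break on equal speed) might be expressed as the following Python sketch. The text does not say what the 10% is measured against, so the sketch assumes it is relative to the largest file; the function name `select_queue` and the tuple layout are hypothetical.

```python
def select_queue(queues, max_size_gap=0.10):
    """Pick the download queue automatically, or signal manual selection.

    queues: list of (name, final_speed, file_size) in creation order.
    Returns the chosen queue name, or None when manual selection is needed.
    """
    if not queues:
        raise ValueError("no download queue available")

    sizes = [size for _, _, size in queues]
    # Assumption: the gap is measured relative to the largest file. Checking
    # the largest against the smallest covers every pair of queues.
    if max(sizes) - min(sizes) > max_size_gap * max(sizes):
        return None

    # max() returns the first maximal item, so a tie goes to the first queue.
    best = max(queues, key=lambda q: q[1])
    return best[0]
```

A return value of `None` would be the point at which the tool asks the user to choose, which is also where simultaneous download of several manually chosen queues could be offered.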
Preferably, the download protocol comprises: the HTTP protocol and the FTP protocol.
Preferably, in the process of checking MD5, if the check of MD5 fails, the current download link is discarded.
Preferably, the download link needs to be respectively marked as a download link of the HTTP source and a download link of the FTP source according to the download protocol.
Preferably, the calculation strategy is as follows: accumulate the download speeds of all download links of HTTP sources in the current download queue and denote the sum as A; accumulate the download speeds of all download links of FTP sources in the current download queue and denote the sum as B; let C = A + B, where C is the fastest download speed under ideal conditions; if A exceeds the preset highest download speed of the HTTP protocol, let D be the highest download speed of the HTTP protocol, otherwise D = A; if B exceeds the preset highest download speed of the FTP protocol, let E be the highest download speed of the FTP protocol, otherwise E = B; let F = D + E; if F exceeds the network bandwidth, set F to the network bandwidth; if F exceeds the disk read-write speed, set F to the disk read-write speed; the resulting F is the final download speed of the current download queue.
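The A to F calculation can be written as a small Python function. This is a minimal sketch: the function name `queue_final_speed`, the protocol labels, and the dictionary of protocol caps are hypothetical, and all speeds are assumed to share one unit.

```python
def queue_final_speed(link_speeds, protocol_caps, bandwidth, disk_speed):
    """Compute the ideal speed C and the final speed F of one download queue.

    link_speeds: list of (protocol, speed), e.g. ("http", 4.0).
    protocol_caps: preset highest download speed per protocol.
    """
    accumulated = {}  # A for "http", B for "ftp"
    for protocol, speed in link_speeds:
        accumulated[protocol] = accumulated.get(protocol, 0) + speed

    ideal = sum(accumulated.values())  # C = A + B

    # D and E: each per-protocol sum is capped at the protocol's maximum.
    capped = sum(
        min(total, protocol_caps.get(protocol, total))
        for protocol, total in accumulated.items()
    )  # D + E

    # F: clamping to bandwidth and then to disk speed equals taking the minimum.
    final = min(capped, bandwidth, disk_speed)
    return ideal, final
```

With HTTP links at speeds 4 and 3, one FTP link at speed 5, an HTTP cap of 6, an FTP cap of 10, a bandwidth of 10, and a disk speed of 50, the values are A = 7, B = 5, C = 12, D = 6, E = 5, and F = min(11, 10, 50) = 10.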
Optionally, the test includes a link validity test: checking whether a valid network connection can be established with the download link, and discarding the current download link if it cannot.
The test also includes predicting the future download speed of the download links of each download queue with a download speed prediction model to obtain the future average download speed of each download link; in practice, the predicted future average download speed is treated as the download speed of the download link.
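The link validity test could be approximated in Python by attempting a TCP connection to the link's host. This is only a sketch under stated assumptions: the text does not say at which level the connection is checked, the default ports, the timeout, and the names `endpoint` and `is_link_valid` are hypothetical, and the speed prediction model is not covered because the text does not specify it.

```python
import socket
from urllib.parse import urlsplit

# Assumption: default ports for the protocols mentioned in the text.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def endpoint(url):
    """Return the (host, port) pair that a download link points to."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if not parts.hostname or port is None:
        raise ValueError("unsupported or malformed link: %r" % url)
    return parts.hostname, port


def is_link_valid(url, timeout=5.0, connect=socket.create_connection):
    """Return True if a TCP connection to the link's host can be opened.

    `connect` is injectable so the check can be exercised without a network.
    """
    try:
        with connect(endpoint(url), timeout=timeout):
            return True
    except (OSError, ValueError):
        # Connection refused, timed out, DNS failure, or malformed link.
        return False
```

A link for which `is_link_valid` returns `False` would be discarded from its download queue. A successful TCP connection does not prove that the file itself is still available, so a real tool would likely follow up with a protocol-level request.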
Correspondingly, this embodiment also discloses a multi-source segmented parallel file downloading tool, comprising: a download setting module, a keyword download module, a download link download module, a BitTorrent file download module, a download distribution module, a download monitoring module and a download completion module.
The download setting module is used for acquiring the download environment set by the user, and comprises: the download file storage position, the disk read-write speed of the download file storage position, the residual disk space of the download file storage position, the HTTP environment download parameter, the FTP environment download parameter, the network bandwidth parameter and the maximum CPU thread number allocated to the download task.
The keyword downloading module is used for collecting keywords of the files to be downloaded input by the user, searching a specific number of downloading links by a keyword matching method, collecting the keywords of the files to be downloaded input by the user, searching a specific number of downloading links by a keyword matching algorithm, and transmitting the collected downloading links to the downloading distribution module for assembling the downloading queue. The keyword matching method comprises the following steps: the invention has the advantages of accurate matching and fuzzy matching, and the specific principle is not repeated.
The download link download module is used for collecting one or more download link addresses input by a user or identifying a txt file uploaded by the user, extracting an identifiable download link from the txt file through intelligent analysis, and transmitting the collected download link to the download distribution module for assembling a download queue.
The BitTorrent file downloading module is used for collecting a remote BitTorrent file address input by a user or a local BitTorrent file for file downloading, or collecting a BitTorrent Magnet URI input by the user for magnet link downloading.
The download distribution module is used for performing MD5 verification on each collected download link before downloading, assembling the download links with the same MD5 value into download queues, testing the download speed of the download links of each download queue, dividing the file of each download queue into data blocks of different sizes according to the speed of each download link in the queue, and allocating a certain number of threads to the current download queue according to the number of data blocks, where each thread is responsible for its corresponding data block and download link, and multiple threads run simultaneously to download the file in parallel. The user needs to select the download queue to be downloaded in the download distribution module, and the selected download queue is transmitted to the download monitoring module for downloading.
Preferably, the download monitoring module is configured to download the file, monitor an exception occurring during the downloading process, prompt the user, and process the exception.
The download completion module is used for storing the index file of the downloaded file after the file is downloaded, so that subsequent searching and deleting are facilitated, meanwhile, the download queue can be stored, and subsequent repeated downloading or sharing is facilitated.
Wherein the exception comprises: unknown errors, single download link timeout, all download link timeout, file I/O error, lost or unavailable resources, network connection failure, download speed too slow, and insufficient available disk space.
EXAMPLE III
As shown in Fig. 2, the overall process of a multi-source segmented parallel file downloading method includes:
Step S101, presetting a download environment, the download protocols, and the highest and lowest download speed under each download protocol.
It should be noted that, in this embodiment, the download environment includes: the storage location of the download file and the maximum number of CPU threads allocated to the download task, where the maximum number of CPU threads does not exceed the number of threads of the computer. At the same time, the disk read-write speed and the remaining disk space of the storage location of the download file are acquired automatically.
It should be noted that, in this embodiment, the download protocol includes: the HTTP protocol and the FTP protocol. The FTP download is set by inputting parameters such as FTP host address, user name, password, port and the like. Meanwhile, the HTTP environment downloading parameter, the FTP environment downloading parameter and the network bandwidth parameter are automatically obtained.
It should be noted that, in this embodiment, the highest download speed and the lowest download speed under each download protocol need to be set by the user, where the highest download speed is used to limit the traffic of the download protocol, and the lowest download speed is used for the discard decision, that is, a download queue that does not reach the lowest download speed is discarded.
In this embodiment, the download link obtaining manner may adopt two methods, one of which is that, referring to step S102, the download link is manually input. Specifically, the method for the user to manually input the download link may be to directly input the download link one by one, or to upload a file containing the download link in txt format or the like, and to extract an identifiable download link from the file through intelligent analysis.
Secondly, referring to step S103, a keyword is input, and download links are intelligently collected according to the keyword. Specifically, a user inputs a keyword of the file to be downloaded, and download links are intelligently acquired from the resource pool through a keyword matching method. The keyword matching method includes exact matching and fuzzy matching. Exact matching is a retrieval mode in which the input keyword is searched as a fixed phrase, and the keyword must be completely identical to a field in the file name corresponding to the download link. Fuzzy matching automatically splits the search term into unit concepts and combines them with a logical AND operation; a download link matches as long as the terms appear in its file name, regardless of their position.
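The two matching modes can be illustrated with a minimal sketch. The function names, the case-insensitive comparison, and splitting the keyword on whitespace are assumptions made for illustration; the embodiment does not specify how the search term is split into unit concepts.

```python
def exact_match(keyword: str, file_name: str) -> bool:
    """Exact matching: the whole keyword is searched as one fixed phrase."""
    return keyword.lower() in file_name.lower()


def fuzzy_match(keyword: str, file_name: str) -> bool:
    """Fuzzy matching: split the keyword into terms and AND them together.

    Splitting on whitespace is an assumption; the position of each term
    inside the file name does not matter.
    """
    terms = keyword.lower().split()
    name = file_name.lower()
    return bool(terms) and all(term in name for term in terms)
```

For example, the keyword "desktop ubuntu" does not exactly match the file name "ubuntu 22.04 desktop.iso", but it matches in fuzzy mode because both terms appear in the name.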
In step S104, MD5 verification is performed on each download link. Specifically, the download method performs MD5 check on the collected download links, and discards the current download link if MD5 check fails.
It should be noted that the MD5 Message-Digest Algorithm is a widely used cryptographic hash function that generates a 128-bit (16-byte) hash value, which is used to check the integrity of transmitted messages. Each file has an MD5 check value determined by its content, so comparing the MD5 check values of the files behind two download links indicates whether the contents of the two files are the same.
Step S105, grouping the download links with the same MD5 check value into the same download queue. Specifically, when the MD5 check values show that the files of two download links are the same, the two links can be merged into the same download queue for multi-source segmented downloading. After the download queues are established, a new download link can be added manually; MD5 verification is performed on the newly added download link, and it is automatically added to the download queue with the same MD5 check value. If its MD5 check value differs from that of every existing queue, a new download queue is created for it.
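Steps S104 and S105 can be sketched as follows. This is only an illustration: the embodiment does not state how the MD5 value of the file behind a link is obtained, so the sketch assumes it is already available as a hex string (or `None` when the check fails). The names `md5_hex`, `build_queues` and `add_link` are hypothetical.

```python
import hashlib
from typing import Dict, List, Optional


def md5_hex(data: bytes) -> str:
    """Return the 128-bit MD5 digest of data as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


def build_queues(link_md5: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Group download links by MD5 value; links whose check failed are dropped."""
    queues: Dict[str, List[str]] = {}
    for link, digest in link_md5.items():
        if digest is None:  # assumed marker for a failed MD5 check
            continue
        queues.setdefault(digest, []).append(link)
    return queues


def add_link(queues: Dict[str, List[str]], link: str, digest: str) -> None:
    """Add a new link to the queue with the same MD5 value, or open a new queue."""
    queues.setdefault(digest, []).append(link)
```

Keying the queues by digest makes the later manual addition of a link a single dictionary lookup.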
Step S106, testing the download links of each queue. Specifically, the test includes: a link validity test, which checks whether an effective network connection can be established with the download link and discards the current download link if the connection cannot be established; and a link download speed test, in which the current download link is fed into a download speed prediction model to predict its future average download speed, and the future average download speed is taken as the actual download speed of the download link.
Step S107, calculating the final download speed of each download queue from the disk read-write speed, the bandwidth condition and the download speed of each download link according to the corresponding method, and selecting the download queue with the highest final download speed.
It should be noted that, in this embodiment, the final download speed of a download queue is calculated as follows: the download speeds of all HTTP-source download links in the current download queue are accumulated, and the sum is recorded as A; the download speeds of all FTP-source download links in the current download queue are accumulated, and the sum is recorded as B; let C = A + B, where C is the fastest download speed under ideal conditions; if A exceeds the set highest download speed of the HTTP protocol, D is the highest download speed of the HTTP protocol, otherwise D = A; if B exceeds the set highest download speed of the FTP protocol, E is the highest download speed of the FTP protocol, otherwise E = B; let F = D + E; if F exceeds the network bandwidth, F is set to the bandwidth; if F exceeds the disk read-write speed, F is set to the disk read-write speed; the resulting F is the final download speed of the current download queue. The download queue with the highest final download speed is then selected.
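The calculation of A through F can be written compactly, since each "if it exceeds, use the limit" rule is a minimum. The following is a minimal sketch; the `Link` class, the field names and the function name are hypothetical, and all speeds are assumed to share one unit.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Link:
    url: str
    protocol: str  # "http" or "ftp"
    speed: float   # predicted future average download speed


def queue_speeds(links: List[Link], http_max: float, ftp_max: float,
                 bandwidth: float, disk_speed: float) -> Tuple[float, float]:
    """Return (C, F): the ideal fastest speed and the final download speed."""
    a = sum(link.speed for link in links if link.protocol == "http")  # A
    b = sum(link.speed for link in links if link.protocol == "ftp")   # B
    c = a + b                              # C: fastest speed under ideal conditions
    d = min(a, http_max)                   # D: HTTP total capped by the HTTP limit
    e = min(b, ftp_max)                    # E: FTP total capped by the FTP limit
    f = min(d + e, bandwidth, disk_speed)  # F: capped by bandwidth and disk speed
    return c, f
```

For two HTTP links at 3 and 4 and one FTP link at 5, with an HTTP limit of 6, an FTP limit of 10, a bandwidth of 20 and a disk speed of 8, this gives C = 12 and F = 8. The queue to download would then be the one with the largest F.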
Specifically, when selecting a download queue, if the volume difference between the downloaded files in two download queues exceeds 10%, the download queue needs to be manually selected.
Specifically, when the download queue is selected, if the final download speeds of the plurality of download queues are the same, the first download queue is selected by default, or the download queue may be selected manually.
Specifically, when selecting a download queue, a plurality of download queues may be manually selected to be downloaded simultaneously.
Step S108, dividing the file of the download queue into data blocks of different sizes in proportion to the speed of each download link in the queue, allocating threads to the current download queue according to the number of data blocks, each thread being responsible for its corresponding data block and download link, and downloading in parallel with multiple threads.
Specifically, for each download queue, the data blocks are allocated as follows: in the order in which the download links were added, each download link is assigned a data block whose share of the file is (download speed of the download link / fastest download speed under ideal conditions) × 100%.
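The proportional division can be sketched as a split of the file into byte ranges. The function name is hypothetical, and two details are assumptions not stated in the embodiment: blocks are expressed as half-open byte ranges, and any rounding remainder is given to the last link.

```python
from typing import List, Tuple


def split_file(file_size: int, speeds: List[float]) -> List[Tuple[int, int]]:
    """Split [0, file_size) into one byte range per link, in link order.

    Each link gets a share of speed / sum(speeds), i.e. its speed divided
    by the ideal fastest download speed C of the queue.
    """
    total = sum(speeds)
    if total <= 0:
        raise ValueError("at least one link must have a positive speed")
    ranges = []
    start = 0
    for index, speed in enumerate(speeds):
        if index == len(speeds) - 1:
            end = file_size  # assumption: the last link absorbs the rounding remainder
        else:
            end = start + int(file_size * speed / total)
        ranges.append((start, end))
        start = end
    return ranges
```

With a 1000-byte file and link speeds 1, 3 and 6, the links receive 10%, 30% and 60% of the file respectively.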
Specifically, threads are allocated to download queues as follows: if only one download queue is used for downloading, all available download threads are allocated to that queue; if several download queues are used for downloading, the threads are by default divided evenly among the download queues; this default can be modified to allocate any number of threads to a download queue, provided that each download queue is allocated at least one thread.
Specifically, threads are allocated to data blocks as follows: for the data blocks in any download queue, if the number of data blocks is less than or equal to the number of threads of the download queue, one thread is allocated to each data block for downloading, and only one download link works in each thread; if the number of data blocks is greater than the number of threads, the threads are allocated to the download links, together with their corresponding data blocks, in descending order of download speed, the download links that receive no thread are added to the standby queue, and the data blocks that remain unallocated are all assigned to the thread of the download link with the highest download speed.
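The allocation of threads to data blocks can be sketched as below. This is an illustration under assumptions: each download link initially owns exactly one data block (identified here by the link's own name), and the function name and return shape are hypothetical.

```python
from typing import Dict, List, Tuple


def assign_threads(link_speeds: Dict[str, float],
                   thread_count: int) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return (plan, standby).

    plan maps each link that received a thread to the data blocks its
    thread downloads; a block is named after the link it was cut for.
    standby lists the links that received no thread.
    """
    if thread_count < 1:
        raise ValueError("a download queue needs at least one thread")
    # Fastest links first; with enough threads every link keeps its own block.
    ordered = sorted(link_speeds, key=link_speeds.get, reverse=True)
    active = ordered[:thread_count]
    standby = ordered[thread_count:]
    plan = {link: [link] for link in active}
    if standby:
        # Blocks left without a thread all go to the fastest link's thread.
        plan[active[0]].extend(standby)
    return plan, standby
```

With links a, b, c and d at speeds 5, 9, 2 and 7 and only two threads, b and d receive threads, a and c go to the standby queue, and the blocks of a and c are downloaded by the thread of b.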
Specifically, data blocks are dynamically reallocated during downloading as follows: when a thread finishes downloading its data block, its download link is removed from the thread, the download speed prediction model is called again to re-predict the future average download speed of that download link and of all download links in the standby queue, and the download link with the highest download speed is added to the thread; the remaining download time of the other threads is then calculated, the thread with the longest remaining download time is selected according to a greedy strategy, and the remaining data of that thread is redistributed in proportion to download speed.
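One possible reading of the greedy step is sketched below. It is an interpretation, not a confirmed implementation: the sketch assumes that the remaining data of the slowest thread is split between that thread and the freed thread in proportion to their link speeds, and that the speed prediction has already been done. The function and parameter names are hypothetical.

```python
from typing import Dict, Tuple


def rebalance(remaining: Dict[str, int], speeds: Dict[str, float],
              freed_speed: float) -> Tuple[str, int, int]:
    """Pick the thread with the longest remaining time and split its work.

    remaining: bytes still to download per busy thread.
    speeds: current download speed of the link in each busy thread (> 0).
    freed_speed: predicted speed of the link placed in the freed thread.
    Returns (thread, bytes_kept, bytes_moved).
    """
    # Greedy choice: the thread whose remaining time (bytes / speed) is longest.
    slowest = max(remaining, key=lambda t: remaining[t] / speeds[t])
    total_speed = speeds[slowest] + freed_speed
    moved = int(remaining[slowest] * freed_speed / total_speed)
    return slowest, remaining[slowest] - moved, moved
```

For instance, if thread t1 has 600 bytes left at speed 2 and thread t2 has 100 bytes left at speed 5, t1 is selected; a freed thread running at speed 4 takes over 400 of its bytes, so both threads would finish in about the same time.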
And after all the data blocks in each download queue are downloaded, uniformly assembling the data blocks into a complete file.
In this embodiment, download queues are assembled according to the MD5 check values of the files behind the download links, the file is divided into data blocks by an intelligent algorithm, and the data blocks are downloaded in segments from multiple sources with multiple threads, so that the network bandwidth is fully utilized and the download time is effectively shortened; meanwhile, the transmission speeds of the HTTP protocol and the FTP protocol are controlled separately, so that the data traffic can be effectively monitored.
EXAMPLE IV
The foregoing embodiment assembles download queues, intelligently divides the file into data blocks, and allocates threads to the data blocks for multi-source segmented downloading, which fully utilizes the network bandwidth and improves the download speed; however, the various exceptions that may occur during and after downloading are neither handled nor reported. This embodiment therefore provides a solution.
As shown in Fig. 3, the download exception handling process of a multi-source segmented parallel file downloading method includes:
in step S201, an abnormality is detected in the downloading process.
It should be noted that the abnormality includes: unknown errors, single download link timeout, all download link timeout, file I/O error, lost or unavailable resources, network connection failure, download speed too slow, and insufficient available disk space.
Step S202, judging the abnormal type.
Specifically, according to the abnormal type parameter transmitted when the abnormality occurs, the system judges the abnormal type and carries out the next processing.
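The dispatch on the exception type parameter can be sketched as an enumeration mapped to the handling steps of Fig. 3 (steps S204 to S218). The enum member names, the table name and the fallback to the unknown-error step for an unrecognized value are assumptions made for illustration.

```python
from enum import Enum, auto


class DownloadError(Enum):
    UNKNOWN = auto()
    SINGLE_LINK_TIMEOUT = auto()
    ALL_LINKS_TIMEOUT = auto()
    FILE_IO_ERROR = auto()
    RESOURCE_LOST = auto()
    NETWORK_FAILURE = auto()
    SPEED_TOO_SLOW = auto()
    DISK_SPACE_INSUFFICIENT = auto()


# Handling step of the flow for each exception type.
HANDLING_STEP = {
    DownloadError.UNKNOWN: "S204",
    DownloadError.SINGLE_LINK_TIMEOUT: "S206",
    DownloadError.ALL_LINKS_TIMEOUT: "S208",
    DownloadError.FILE_IO_ERROR: "S210",
    DownloadError.RESOURCE_LOST: "S212",
    DownloadError.NETWORK_FAILURE: "S214",
    DownloadError.SPEED_TOO_SLOW: "S216",
    DownloadError.DISK_SPACE_INSUFFICIENT: "S218",
}


def handling_step(error_type) -> str:
    """Return the handling step for an exception type parameter.

    Falling back to the unknown-error step is an assumption of this sketch.
    """
    return HANDLING_STEP.get(error_type, "S204")
```

A table of this kind keeps the judgment in step S202 separate from the individual handlers, so a new exception type only needs a new entry.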
Step S203, unknown error.
Specifically, the download task cannot proceed, and the system receives the unknown-error parameter and enters the corresponding processing step.
Step S204, stopping the download, reporting the error type, and letting the user choose whether to download with another download queue.
Specifically, because an unknown error occurs, the downloading task cannot be continued, the error is not in the system processing range, the downloading of the current downloading queue is stopped, the thread allocated to the downloading queue is released, the error type is reported to the system, and the downloading task is recorded into the log. Meanwhile, the system sends an option to inquire whether the user selects other download queues for downloading, and the user can select to save or delete the unfinished download file.
In step S205, the single download link times out.
Specifically, if the download progress of the download task in a thread does not change for a period of time, a single-download-link timeout error occurs; the system receives the single-download-link timeout parameter and enters the corresponding processing step.
Step S206, waiting for the download link to respond; if the connection cannot be reestablished within a certain time, stopping the download of that download link, handing its data block to another download link with a higher download speed, reporting the error type, and marking the download link.
Specifically, because a single download link has timed out, the download progress of the download task in a thread does not change for a period of time. If the connection cannot be reestablished within a certain time, the download link is removed from the thread, the download speed prediction model is called again to re-predict the future average download speed of all download links in the standby queue, the download link with the highest download speed is added to the thread, and the download of the current data block continues. After the error is handled, the download task is recorded in the log, and the download link is marked as a timeout link that will no longer be retrieved from the resource pool.
In step S207, all download links time out.
Specifically, if the download progress of the download tasks in all threads does not change for a period of time while the network is normal and no effective network connection can be established with the download links, an all-download-links timeout error occurs; the system receives the all-download-links timeout parameter and enters the corresponding processing step.
Step S208, waiting for the response of the download link in the queue, if the connection can not be reestablished within a certain time or the download speed after the connection is established is lower than the lowest download speed, stopping all the downloads in the queue, reporting the error type, and selecting whether to select other download queues for downloading.
Specifically, because all the download links are overtime, the download progress of the download tasks in all the threads is not changed within a period of time, the network is normal and effective network connection with the download links cannot be established, if connection cannot be reestablished within a certain period of time, or the total download speed of the download queue is still lower than the minimum download speed after connection is reestablished, the system defaults to stop all the downloads in the queue, releases the threads distributed to the download queue, reports error types to the system at the same time, and records the download tasks into a log. Meanwhile, the system sends an option to inquire whether the user selects other download queues for downloading, and the user can select to save or delete the unfinished download file. The user may continue to use the queue for downloads below the minimum download rate, at which point the system will only allocate the number of processes in the queue for the number of download links that can still establish a connection. The reason for this anomaly may be that the channel for downloading the file is closed or the downloaded file is blocked.
Step S209, file I/O error.
Specifically, if the download progress of the download tasks in all threads does not change for a period of time while the network is normal and network connections can be established with the download links, a file I/O error occurs; the system receives the file I/O error parameter and enters the corresponding processing step.
Step S210, checking whether disk I/O can be performed normally; if not, reporting the error type and notifying the user to select another disk for storage; if disk I/O can be performed normally, stopping the download, reporting an unknown error type, and entering the unknown error processing step.
Specifically, because the download progress of the download tasks in all threads does not change for a period of time while the network is normal and network connections can be established with the download links, the disk I/O condition at the file storage location needs to be checked: if disk I/O is normal, the error is handled as an unknown error; if disk I/O fails, the error type is reported to the system and the download task is recorded in the log. At the same time, the system presents an option asking whether the user wants to store the downloaded file on another disk; if the user selects another disk, the download queue is downloaded again, otherwise the download is stopped. A typical cause of this exception is that the file is being downloaded to a USB drive that is then forcibly removed.
In step S211, the resource is lost or not found.
Specifically, if the data blocks cannot be assembled into a complete file after the download is completed, a resource-lost-or-not-found error occurs; the system receives the resource-lost-or-not-found parameter and enters the corresponding processing step.
Step S212, the downloaded file may have been moved or deleted by the user; reporting the error type and downloading the lost resource again.
Specifically, because the download speeds of the download links differ, the data blocks do not finish downloading at the same time, and the user may move or delete a data block after it has been downloaded, which causes a resource-lost-or-not-found error. In this case, the numbers of the lost data blocks are checked, threads are allocated to the download queue again, and the lost data blocks are assigned to the allocated threads according to the download speed of the download links. The error type is reported to the system, and the download task is recorded in the log.
In step S213, the network connection fails.
Specifically, if the download progress of the download tasks in all threads does not change for a period of time and no network connection can be established, a network connection failure error occurs; the system receives the network connection failure parameter and enters the corresponding processing step.
Step S214, reporting error type, checking network connection, and trying to reestablish network connection.
Specifically, the system reports the error type because the network connection cannot be established, records the downloading task into a log, sends an option to inquire a user to check the network connection, rechecks the network connection after the user clicks confirmation, continues downloading if the network connection is normal, and repeats the steps if the network connection is not normal.
In step S215, the download speed is too slow.
Specifically, if the total download speed in the current download queue is lower than the minimum download speed within a period of time, an over-slow download error occurs, and the system receives the parameter of the over-slow download error and enters a corresponding processing step.
Step S216, waiting for a certain time, if the sum of all the download link speeds in the queue is still lower than the lowest download speed, reporting the error type, and selecting whether to select other download queues for downloading or continue downloading.
Specifically, since the total download speed of all the download links in the current download queue is lower than the minimum download speed, if the normal download speed cannot be recovered within a certain time, the system defaults to stop all the downloads in the queue, releases the threads allocated to the download queue, reports the error type to the system, and records the download task to the log. Meanwhile, the system sends an option to inquire whether the user selects other download queues for downloading, and the user can select to save or delete the unfinished download file. The user may continue to use the queue for downloads below the minimum download rate, at which point the system will only allocate the number of processes in the queue for the number of download links that can still establish a connection. The reason for this anomaly may be that the channel for downloading the file is limited or that the network bandwidth is occupied too much.
In step S217, there is not enough available disk space.
Specifically, if the current download queue has a normal download speed and the disk output is normal but the data blocks cannot be written, an insufficient-available-disk-space error occurs; the system receives the insufficient-available-disk-space parameter and enters the corresponding processing step.
Step S218, reporting the error type and reminding the user to clean up the disk space; downloading continues automatically after the user finishes cleaning, or the user selects another storage location to continue downloading.
Specifically, since the data block file cannot be written, the output of the disk is normal, the system stops all downloads in the queue by default, releases the threads allocated to the download queue, reports the error type to the system at the same time, and records the download task into the log. At the same time, the system issues options to ask the user to clear disk space or select other storage locations: if the user selects to clear the disk space, the disk remaining space is confirmed again, if the remaining space is enough, the downloading is continued, and if the remaining space is not enough, the steps are repeated; if the user selects other storage positions, the data block currently being downloaded is abandoned, the existing data block is copied to the new storage position, the original position data block is deleted after the copying is finished, and the missing data block is continuously downloaded.
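The re-check of the remaining disk space after the user cleans up can be sketched with the Python standard library. The function name and the idea of comparing free space against the number of bytes still to be written are assumptions; `shutil.disk_usage` is the standard call that reports total, used and free bytes for the disk containing a path.

```python
import shutil


def has_enough_space(storage_path: str, bytes_needed: int) -> bool:
    """Return True if the disk holding storage_path has bytes_needed free."""
    return shutil.disk_usage(storage_path).free >= bytes_needed
```

A caller would repeat the prompt to the user while this returns False, and resume the download queue once it returns True.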
In this embodiment, the various exceptions that occur during and after downloading are handled intelligently, and the data blocks of invalid or low-speed links are automatically reassigned to other normal links to complete the download, which ensures the reliability of download transmission, improves the download success rate, and safeguards the user's download experience.
EXAMPLE V
In this embodiment, the present invention constructs a downloading tool by using the methods proposed in the above embodiments. As shown in Fig. 4, the tool specifically includes: a download setting module 301, a keyword download module 302, a download link download module 303, a BitTorrent file download module 304, a download distribution module 305, a download monitoring module 306, and a download completion module 307.
The download setting module 301 is configured to obtain the download environment set by the user, including: the download file storage location, the disk read-write speed of the download file storage location, the remaining disk space of the download file storage location, the HTTP environment download parameters, the FTP environment download parameters, the network bandwidth parameter, and the maximum number of CPU threads allocated to the download task.
The keyword downloading module 302 is configured to collect keywords of a file to be downloaded input by a user, retrieve a specific number of downloading links from a resource pool through keyword precise matching or fuzzy matching, and transmit the downloading links to the downloading distribution module 305 for assembling a downloading queue.
The download link downloading module 303 is configured to collect one or more download link addresses input by a user, or identify a txt file uploaded by the user, extract an identifiable download link from the txt file through intelligent analysis, and transmit the download link to the download allocating module 305 for assembling a download queue.
The BitTorrent file downloading module 304 collects a remote BitTorrent file address input by a user or a local BitTorrent file for file downloading, or collects a BitTorrent Magnet URI input by the user for magnet link downloading. BitTorrent and magnet link downloads are not distributed by the download distribution module and cannot be added to a download queue. The download process is still managed by the download monitoring module 306, although some of the exceptions do not apply to BitTorrent downloading, and the downloaded file is likewise managed by the download completion module 307.
The download allocation module 305 is configured to perform MD5 verification on each download link before downloading, assemble the download links with the same MD5 value into download queues, test the download speed of the download link of each download queue, divide the file of each download queue into data blocks with different sizes according to the speed of each download link in the queue, allocate a certain number of threads to the current download queue according to the number of the data blocks, where each thread is responsible for the corresponding data block and the download link, and run multithread simultaneously to download files in parallel. The user needs to select a download queue to be downloaded in the download distribution module, and transmit the download queue to the download monitoring module 306 for downloading.
And the download monitoring module 306 is used for downloading the file, monitoring the abnormality occurring in the downloading process, prompting the user, and processing the abnormality.
And a download completing module 307, configured to save the index file of the downloaded file after the file is downloaded, so as to facilitate subsequent searching and deleting, and also save a download queue, thereby facilitating subsequent repeated downloading or sharing.
The present invention has been described with reference to the above embodiments and the accompanying drawings, however, the above embodiments are only examples for carrying out the present invention. It should be noted that the disclosed embodiments do not limit the scope of the invention. Rather, modifications and equivalent arrangements included within the spirit and scope of the claims are included within the scope of the invention.

Claims (10)

1. A multi-source segmented parallel file downloading method, characterized by comprising the following steps:
step one: performing MD5 verification on the acquired download links, and classifying the download links with the same MD5 value into the same download queue;
step two: acquiring the queue downloading speed of each downloading queue, and selecting the downloading queue with the highest speed to perform segmented parallel downloading of the files;
step three: dividing the file of the download queue into data blocks with different sizes according to the speed of each download link in the download queue in proportion, distributing threads to the current download queue according to the number of the data blocks, wherein each thread is responsible for the corresponding data block and the download link, and downloading in parallel in multiple threads;
step four: and after downloading, assembling the data blocks and combining the data blocks into a unified file.
2. The method of claim 1, wherein the first step is preceded by: acquiring a download link through a keyword matching acquisition mode and/or a manual acquisition mode;
wherein, obtaining the download link through the keyword matching obtaining mode includes:
collecting keywords of a target file to be downloaded;
and collecting the downloading links in the existing downloading resource pool according to the keywords.
3. The method of claim 1, wherein the first step is preceded by:
presetting a downloading environment, protocols, and highest downloading speed and lowest downloading speed under each protocol;
the first step further comprises the following steps:
carrying out validity test on the download link of each download queue to obtain the validity of each download link;
predicting the future downloading speed of the downloading link of each downloading queue through a downloading speed prediction model to obtain the future average downloading speed of each downloading link;
labeling a protocol source for the download link in each download queue, wherein the protocol source is used for reflecting a protocol corresponding to the download link;
the second step comprises the following steps:
accumulating the future average download speeds of the download links with the same protocol source in each download queue to obtain the accumulated download speed corresponding to each protocol;
judging whether the accumulated downloading speed exceeds the highest downloading speed of the corresponding protocol, if so, taking the highest downloading speed of the corresponding protocol as the actual downloading speed, and if not, taking the accumulated downloading speed as the actual downloading speed;
calculating the sum of actual downloading speeds, and determining the fastest downloading speed;
comparing the fastest download speed with the network bandwidth and the disk read-write speed in the download environment respectively: if the fastest download speed is less than both the network bandwidth and the disk read-write speed, taking the fastest download speed as the queue download speed; if the fastest download speed is greater than only one of the network bandwidth and the disk read-write speed, taking that network bandwidth or disk read-write speed as the queue download speed; and if the fastest download speed is greater than both the network bandwidth and the disk read-write speed, taking the lower of the two as the queue download speed.
4. The multi-source segmented parallel file downloading method according to claim 3, wherein the second step further comprises: calculating the sum of the accumulated downloading speeds to obtain the ideal fastest downloading speed;
the third step comprises:
distributing data blocks to the download links in the order in which the links were added, wherein the share of the file allocated to each download link is: (download speed of the download link / ideal fastest download speed) × 100%;
if only one download queue is used for downloading, allocating all available download threads to that download queue; if a plurality of download queues are used for downloading, dividing the threads among the download queues by default, wherein the default setting can be modified to allocate any number of threads to a download queue, provided that each download queue is allocated at least one thread;
for the data blocks in any download queue, if the number of data blocks is less than or equal to the number of threads of the download queue, allocating one thread to each data block for downloading, with only one download link working in each thread; if the number of data blocks is greater than the number of threads, allocating the threads to the download links and their corresponding data blocks in descending order of download link speed, adding the download links that are not allocated a thread to a standby queue, and assigning all unallocated data blocks to the thread of the download link with the highest downloading speed;
dynamically redistributing the data blocks among the threads during downloading: after a thread finishes downloading its data blocks, removing its download link from the thread, calling the downloading speed prediction model again to predict the future average downloading speed of that download link and of all download links in the standby queue, selecting the download link with the highest predicted speed to join the thread, calculating the remaining downloading time of the other threads, selecting the thread with the longest remaining downloading time, and redistributing the remaining data blocks of that thread in proportion to the downloading speeds.
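The proportional allocation in this claim can be illustrated with a short sketch. It assumes the ideal fastest download speed equals the sum of the predicted speeds of all links in the queue, and it represents each data block as a half-open byte range; the function name and the rounding rule (the last block absorbs any remainder) are choices made for the sketch, not details from the patent.

```python
def split_into_blocks(file_size, link_speeds):
    """Return one (start, end) byte range per link, with end exclusive.

    link_speeds: predicted speeds in the order the links were added.
    Each link receives a share of speed / sum(link_speeds) of the file.
    """
    total = sum(link_speeds)  # ideal fastest download speed (assumed > 0)
    blocks = []
    start = 0
    cumulative = 0
    for index, speed in enumerate(link_speeds):
        cumulative += speed
        if index == len(link_speeds) - 1:
            end = file_size  # last block absorbs rounding remainder
        else:
            end = int(file_size * cumulative / total)
        blocks.append((start, end))
        start = end
    return blocks


blocks = split_into_blocks(1000, [5, 3, 2])
```

With speeds 5, 3 and 2, the three links receive 50%, 30% and 20% of a 1000-byte file. Byte ranges of this kind map naturally onto HTTP range requests, although the patent text here does not say how blocks are requested.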
5. The multi-source segmented parallel file downloading method according to claim 1, further comprising, after the third step: monitoring for anomalies during downloading, prompting the user, and handling the anomalies; wherein the anomalies include one or more of: unknown error, timeout of a single download link, timeout of all download links, file I/O error, resource lost or not found, network connection failure, excessively slow download speed, and insufficient available disk space;
the fourth step further comprises:
and after the file is downloaded, storing the index file and the download queue of the downloaded file.
6. A multi-source segmented parallel file downloading tool, comprising:
a download distribution module to:
performing MD5 verification on the acquired download links, and classifying the download links with the same MD5 value into the same download queue;
acquiring the queue downloading speed of each downloading queue, and selecting the downloading queue with the highest speed to perform segmented parallel downloading of the files;
dividing the file of a download queue into data blocks whose sizes are proportional to the speed of each download link in the download queue, and allocating threads to the current download queue according to the number of data blocks, wherein each thread is responsible for its corresponding data block and download link, and performing multi-threaded parallel downloading;
a download monitoring module for downloading the file;
and a download completion module for assembling the data blocks after the download is completed and merging them into a single file.
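Two parts of this claim lend themselves to a small sketch: grouping download links by MD5 value into download queues, and assembling downloaded data blocks into one file. The sketch assumes that an MD5 value of the target file is already known for each link (how it is obtained is not stated here) and that each downloaded block is kept in memory keyed by its start offset. Both function names are hypothetical.

```python
from collections import defaultdict


def build_download_queues(links_with_md5):
    """Group links whose target files share an MD5 value into one queue.

    links_with_md5: iterable of (url, md5_hex) pairs (hypothetical input).
    """
    queues = defaultdict(list)
    for url, md5_hex in links_with_md5:
        # Hex digests are case-insensitive, so normalize before grouping.
        queues[md5_hex.lower()].append(url)
    return dict(queues)


def assemble(blocks_by_offset):
    """Concatenate downloaded blocks in order of their start offset."""
    return b"".join(blocks_by_offset[offset] for offset in sorted(blocks_by_offset))


queues = build_download_queues([
    ("http://a.example/f.iso", "ABC123"),
    ("ftp://b.example/f.iso", "abc123"),
    ("http://c.example/g.iso", "def456"),
])
merged = assemble({4: b"efgh", 0: b"abcd"})
```

A real tool would more likely write each block to its offset in a file on disk rather than join byte strings in memory; the in-memory form is used only to keep the example short.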
7. The tool of claim 6, further comprising: a BitTorrent file downloading module, a keyword downloading module and/or a manual acquisition module, wherein the keyword downloading module is used for:
collecting keywords of a file to be downloaded; collecting downloading links in an existing downloading resource pool according to the keywords;
the manual acquisition module is used for:
collecting one or more download link addresses input by a user, or parsing a txt file uploaded by the user to extract the recognizable download links from it, and transmitting the collected download links to the download distribution module for assembling a download queue;
the BitTorrent file downloading module is used for collecting a remote BitTorrent file address input by a user or a local BitTorrent file for file downloading, or collecting a BitTorrent Magnet URI input by the user for magnet link downloading.
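A minimal sketch of extracting recognizable download links from the contents of a txt file is given below. The regular expression, the set of recognized schemes (HTTP, HTTPS, FTP and magnet URIs) and the function names are assumptions for illustration; the patent does not specify how links are recognized.

```python
import re

# Hypothetical pattern: a link starts with a known scheme and runs to whitespace.
LINK_PATTERN = re.compile(r"(?:https?://|ftp://|magnet:\?)\S+", re.IGNORECASE)


def extract_links(text):
    """Return the distinct download links found in text, in order of appearance."""
    seen = set()
    links = []
    for match in LINK_PATTERN.findall(text):
        link = match.rstrip(".,;)")  # drop trailing punctuation from prose
        if link not in seen:
            seen.add(link)
            links.append(link)
    return links


def protocol_source(link):
    """Label a link with its protocol source, taken from the URI scheme."""
    return link.split(":", 1)[0].lower()


sample = (
    "mirror: http://a.example/f.iso, backup ftp://b.example/f.iso\n"
    "magnet:?xt=urn:btih:abc123 and again http://a.example/f.iso"
)
found = extract_links(sample)
sources = [protocol_source(link) for link in found]
```

The scheme-based label corresponds to the protocol source that the download distribution module attaches to each link.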
8. The tool of claim 6, further comprising a download setting module for acquiring a download environment set by a user, wherein the download environment comprises: the storage location of the downloaded file, the disk read-write speed at that storage location, the remaining disk space at that storage location, HTTP environment download parameters, FTP environment download parameters, network bandwidth parameters, and the maximum number of CPU threads allocated to the download task;
the download distribution module is further configured to:
carrying out validity test on the download link of each download queue to obtain the validity of each download link;
predicting the future downloading speed of the downloading link of each downloading queue through a downloading speed prediction model to obtain the future average downloading speed of each downloading link, and taking the predicted future average downloading speed as the actual downloading speed of the downloading link;
labeling a protocol source for the download link in each download queue, wherein the protocol source is used for reflecting a protocol corresponding to the download link;
the obtaining of the queue downloading speed of each downloading queue specifically includes:
accumulating the future average download speeds of the download links with the same protocol source in each download queue to obtain the accumulated download speed corresponding to each protocol;
judging whether the accumulated downloading speed exceeds the highest downloading speed of the corresponding protocol, if so, taking the highest downloading speed of the corresponding protocol as the actual downloading speed, and if not, taking the accumulated downloading speed as the actual downloading speed;
calculating the sum of the actual downloading speeds to obtain the fastest downloading speed;
comparing the fastest downloading speed with the network bandwidth and the disk read-write speed in the downloading environment: if the fastest downloading speed is less than both the network bandwidth and the disk read-write speed, taking the fastest downloading speed as the queue downloading speed; if the fastest downloading speed is greater than only one of the network bandwidth and the disk read-write speed, taking that one as the queue downloading speed; and if the fastest downloading speed is greater than both, taking the lower of the two as the queue downloading speed.
9. The tool of claim 8, wherein the download distribution module is specifically configured to:
distributing data blocks to the download links in the order in which the links were added, wherein the share of the file allocated to each download link is: (download speed of the download link / ideal fastest download speed) × 100%;
if only one download queue is used for downloading, allocating all available download threads to that download queue; if a plurality of download queues are used for downloading, dividing the threads among the download queues by default, wherein the default setting can be modified to allocate any number of threads to a download queue, provided that each download queue is allocated at least one thread;
for the data blocks in any download queue, if the number of data blocks is less than or equal to the number of threads of the download queue, allocating one thread to each data block for downloading, with only one download link working in each thread; if the number of data blocks is greater than the number of threads, allocating the threads to the download links and their corresponding data blocks in descending order of download link speed, adding the download links that are not allocated a thread to a standby queue, and assigning all unallocated data blocks to the thread of the download link with the highest downloading speed;
dynamically redistributing the data blocks among the threads during downloading: after a thread finishes downloading its data blocks, removing its download link from the thread, calling the downloading speed prediction model again to predict the future average downloading speed of that download link and of all download links in the standby queue, selecting the download link with the highest predicted speed to join the thread, calculating the remaining downloading time of the other threads, selecting the thread with the longest remaining downloading time, and redistributing the remaining data blocks of that thread in proportion to the downloading speeds.
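The thread allocation rule and the choice of which thread to rebalance can be sketched as below. The sketch assumes one data block per download link, identifies each thread by the link working in it, and estimates remaining downloading time as remaining bytes divided by predicted speed; these representations and both function names are illustrative assumptions rather than details from the patent.

```python
def assign_threads(blocks, link_speeds, thread_count):
    """Assign links and their blocks to threads, fastest links first.

    blocks: dict mapping link -> its data block (hypothetical layout).
    Returns (work, standby): work maps each active link to its list of
    blocks; standby lists the links that did not receive a thread.
    Assumes thread_count >= 1, as each queue gets at least one thread.
    """
    ranked = sorted(blocks, key=lambda link: link_speeds[link], reverse=True)
    if len(ranked) <= thread_count:
        return {link: [blocks[link]] for link in ranked}, []

    active, standby = ranked[:thread_count], ranked[thread_count:]
    work = {link: [blocks[link]] for link in active}
    # Blocks left without a thread all go to the fastest link's thread.
    work[active[0]].extend(blocks[link] for link in standby)
    return work, standby


def thread_to_rebalance(remaining_bytes, link_speeds):
    """Pick the active link whose thread has the longest remaining time."""
    return max(remaining_bytes, key=lambda link: remaining_bytes[link] / link_speeds[link])


speeds = {"a": 5, "b": 3, "c": 2}
blocks = {"a": (0, 500), "b": (500, 800), "c": (800, 1000)}
work, standby = assign_threads(blocks, speeds, thread_count=2)
slowest = thread_to_rebalance({"a": 100, "b": 90}, speeds)
```

With two threads, links `a` and `b` become active, `c` joins the standby queue, and the block originally intended for `c` is handed to the thread of `a`. The thread of `b` is then selected for rebalancing because 90 / 3 exceeds 100 / 5.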
10. The tool of claim 6, wherein the download monitoring module is further configured to: download the file, monitor for anomalies during downloading, prompt the user, and handle the anomalies; wherein the anomalies include one or more of: unknown error, timeout of a single download link, timeout of all download links, file I/O error, resource lost or not found, network connection failure, excessively slow download speed, and insufficient available disk space;
the download completion module is also used for storing the index file and the download queue of the downloaded file after the file download is completed.
CN202210137400.1A 2022-02-15 2022-02-15 Multi-source segmented parallel file downloading method and tool Active CN114640665B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210137400.1A CN114640665B (en) 2022-02-15 2022-02-15 Multi-source segmented parallel file downloading method and tool

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210137400.1A CN114640665B (en) 2022-02-15 2022-02-15 Multi-source segmented parallel file downloading method and tool

Publications (2)

Publication Number Publication Date
CN114640665A true CN114640665A (en) 2022-06-17
CN114640665B CN114640665B (en) 2023-02-10

Family

ID=81945952

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210137400.1A Active CN114640665B (en) 2022-02-15 2022-02-15 Multi-source segmented parallel file downloading method and tool

Country Status (1)

Country Link
CN (1) CN114640665B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115086304A (en) * 2022-07-08 2022-09-20 甘肃省气象信息与技术装备保障中心 Multi-source distributed downloading system based on FTP protocol

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1852307A (en) * 2005-10-10 2006-10-25 华为技术有限公司 Data downloading method
CN101079709A (en) * 2006-06-15 2007-11-28 腾讯科技(深圳)有限公司 Single-node-to-multi-node concurrent download system and method
US20120271880A1 (en) * 2011-04-19 2012-10-25 Accenture Global Services Limited Content transfer accelerator
CN105025068A (en) * 2014-04-30 2015-11-04 腾讯科技(深圳)有限公司 Network data downloading method and apparatus
CN107347092A (en) * 2017-06-30 2017-11-14 环球智达科技(北京)有限公司 The method downloaded for multithreading
CN108076117A (en) * 2016-11-14 2018-05-25 腾讯科技(深圳)有限公司 A kind of data download method and user terminal
CN110784520A (en) * 2019-09-30 2020-02-11 北京字节跳动网络技术有限公司 File downloading method and device and electronic equipment
CN112492033A (en) * 2020-11-30 2021-03-12 深圳市移卡科技有限公司 File transmission method, system and computer readable storage medium

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1852307A (en) * 2005-10-10 2006-10-25 华为技术有限公司 Data downloading method
CN101079709A (en) * 2006-06-15 2007-11-28 腾讯科技(深圳)有限公司 Single-node-to-multi-node concurrent download system and method
US20120271880A1 (en) * 2011-04-19 2012-10-25 Accenture Global Services Limited Content transfer accelerator
CN105025068A (en) * 2014-04-30 2015-11-04 腾讯科技(深圳)有限公司 Network data downloading method and apparatus
CN108076117A (en) * 2016-11-14 2018-05-25 腾讯科技(深圳)有限公司 A kind of data download method and user terminal
CN107347092A (en) * 2017-06-30 2017-11-14 环球智达科技(北京)有限公司 The method downloaded for multithreading
CN110784520A (en) * 2019-09-30 2020-02-11 北京字节跳动网络技术有限公司 File downloading method and device and electronic equipment
CN112492033A (en) * 2020-11-30 2021-03-12 深圳市移卡科技有限公司 File transmission method, system and computer readable storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115086304A (en) * 2022-07-08 2022-09-20 甘肃省气象信息与技术装备保障中心 Multi-source distributed downloading system based on FTP protocol
CN115086304B (en) * 2022-07-08 2024-04-19 甘肃省气象信息与技术装备保障中心 Multi-source distributed downloading system based on FTP protocol

Also Published As

Publication number Publication date
CN114640665B (en) 2023-02-10

Similar Documents

Publication Publication Date Title
US7873594B2 (en) System analysis program, system analysis method, and system analysis apparatus
WO2019227689A1 (en) Data monitoring method and apparatus, and computer device and storage medium
US7631034B1 (en) Optimizing node selection when handling client requests for a distributed file system (DFS) based on a dynamically determined performance index
EP2338112B1 (en) Batch processing system
CN109710615B (en) Database access management method, system, electronic device and storage medium
CN107370806B (en) HTTP status code monitoring method, device, storage medium and electronic equipment
US20060048155A1 (en) Organizing transmission of repository data
US20050060493A1 (en) Negotiated distribution of cache content
CN112835792B (en) Pressure testing system and method
CN104584524A (en) Aggregating data in a mediation system
KR101945430B1 (en) Method for improving availability of cloud storage federation environment
CN114640665B (en) Multi-source segmented parallel file downloading method and tool
CN112988679B (en) Log acquisition control method and device, storage medium and server
CN112288092A (en) Model evaluation method, model evaluation device, electronic device and storage medium
CN113485999A (en) Data cleaning method and device and server
CN113760677A (en) Abnormal link analysis method, device, equipment and storage medium
CN113778810A (en) Log collection method, device and system
CN115860709A (en) Software service guarantee system and method
US11243857B2 (en) Executing test scripts with respect to a server stack
US11200138B1 (en) Policy-based request tracing using a computer
CN113704203A (en) Log file processing method and device
US20060015593A1 (en) Three dimensional surface indicating probability of breach of service level
US8015207B2 (en) Method and apparatus for unstructured data mining and distributed processing
CN116431872B (en) Observable system and service observing method based on observable system
CN113434376B (en) Web log analysis method and device based on NoSQL

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant