CN106850778B - Multi-source download performance optimization method and device - Google Patents

Info

Publication number
CN106850778B
Authority
CN
China
Prior art keywords
downloading
download
data
data sources
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201710031275.5A
Other languages
Chinese (zh)
Other versions
CN106850778A (en)
Inventor
陈茜
苗欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Tsinghua National Laboratory For Information Science And Technology Internet Of Things Technology Center
Original Assignee
Wuxi Tsinghua National Laboratory For Information Science And Technology Internet Of Things Technology Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Tsinghua National Laboratory For Information Science And Technology Internet Of Things Technology Center
Priority to CN201710031275.5A
Publication of CN106850778A
Application granted
Publication of CN106850778B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/1824 Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G06F16/183 Provision of network file services by network file servers, e.g. by using NFS, CIFS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a multi-source download performance optimization method and device. The method classifies log data at the cloud and identifies log data of the same type as required; analyzes and compares the measurement indexes of the classified log data and computes the required statistics; comprehensively compares and analyzes the results for these indexes and defines a suitable download scheme for each set of download characteristics; sends the download schemes obtained from the cloud analysis to the client; analyzes data collected in real time during downloading to obtain the current download performance; and matches the current download against the existing download schemes to select a suitable one. The method and the corresponding device address problems in current multi-source downloading such as poor performance when more data sources are used and a high abandon rate when paid data sources are used.

Description

Multi-source download performance optimization method and device
Technical Field
The invention relates to the technical field of network content distribution, in particular to a multi-source downloading performance optimization method and device.
Background
File downloading, one of the most ubiquitous network services, has gone through several generations of supporting technologies, including direct client/server (C/S) distribution, content distribution networks (CDNs), peer-to-peer (P2P) networks, and cloud-based technologies. However, the current network infrastructure cannot keep up with the ever-increasing number of users and volume of data transmission. As a result, file download performance (such as download speed and download success rate) is often unsatisfactory.
To speed up file downloads, many network service and content providers employ the most recent file download approach, multi-source file download. In multi-source downloading, a user (i.e., a client) may obtain different portions of a requested file from multiple data sources simultaneously through a variety of content distribution techniques and protocols. For example, in a P2P file download system, a client may maintain tens of TCP connections to peers to retrieve different data blocks of a requested file concurrently. When the P2P download mode is extended to the hybrid P2SP (peer to server and peer) mode, the user can obtain data blocks from dedicated servers in addition to peers. Intuitively, multi-source downloading can effectively improve the performance of file downloads, especially for large files, and a number of previous research efforts have demonstrated this. Over time, however, the interaction between the client and multiple data sources has become much more complex than it was a few years ago. If the multi-source download protocol is not designed properly, it may result in poorer performance and higher network and financial overhead than the original single-source download. This phenomenon is referred to as a performance anomaly of multi-source downloading. Although performance anomalies occur on the client side, there is currently little research that fully characterizes these anomalies in real systems, let alone guides users and developers in dealing with them.
To fully understand the performance anomalies in multi-source downloads, previous research work used large-scale industrial multi-source download data to analyze in depth the characteristics of the seven most commonly used download technologies (i.e., C/S, free C/S mirror, paid C/S mirror, free content distribution network, paid content distribution network, network service provider cache, and peer-to-peer network). The multi-source download process commonly used in many popular downloaders in the industry (e.g., QQ Xuanfeng, Xunlei (Thunder), etc.) is as follows. After receiving a download request from a user, the downloader installed at the client starts downloading the requested file from the original data source. Meanwhile, the cloud of the downloader queries the available data sources and notifies the client of the data sources it finds, so that the client is upgraded from original-link (single-source) downloading to multi-source downloading and the download is accelerated. After the download finishes, the client sends the data recorded during the whole download process to the cloud of the downloader. Using the log data collected in this process, the performance of multi-source downloading can be compared with that of original-link downloading, the characteristics of the seven types of download technologies can be analyzed, and the performance anomalies in multi-source downloading can be comprehensively analyzed and summarized.
The performance anomalies of multi-source downloading found by previous research work are mainly the following: the popularity of a file and the amount of resources available for it are not completely positively correlated; when the data sources used in a download participate for inconsistent lengths of time, the performance of the multi-source download is often poor; using more data sources in a multi-source download does not necessarily yield better download performance; and paid data sources tend to perform worse because of bandwidth limits and similar restrictions. To understand the root causes of these anomalies, the research work explored some conventional measurement indexes (such as download speed, download time, number of data sources, file size, and file popularity) and some unconventional measurement indexes (such as diversity of participation time, abandon rate, and estimated remaining download time) in multi-source downloading. The analysis results show that when the value of a measurement index in one download differs markedly from its value in other downloads, the download performance often falls short of the expected ideal performance. Through experimental analysis, previous work thus arrived at a number of practically useful insights and conclusions to guide the algorithm design of downloaders so that they can effectively handle the anomalies that occur in multi-source downloads.
Disclosure of Invention
The present invention provides a multi-source download performance optimization method and device to solve the problems mentioned in the Background section above.
To achieve this purpose, the invention adopts the following technical scheme:
A multi-source download performance optimization method comprises the following steps:
S101, the cloud sorts the existing log data and classifies it according to the number of data sources used in each download, so that all logs of downloads using the same number of data sources are found;
S102, the cloud compares and analyzes each class of log data and computes the maximum, minimum, median and mean of each measurement index for different file sizes within each class;
S103, the cloud comprehensively compares and analyzes the computed values of the measurement indexes and derives download schemes suitable for downloads with different characteristics;
S104, the cloud collects the download schemes suitable for downloads with different characteristics and sends them to the client;
S105, the client collects and analyzes data during the download to obtain the current download performance;
S106, the client takes the current download performance and the characteristics of the file being downloaded as the characteristic information of the current download, and selects the optimal download scheme and adjusts accordingly to optimize multi-source download performance.
Specifically, step S101 includes: the cloud sorts the download logs in ascending order by three columns, namely user ID, file ID and download completion time; log entries for data sources with the same user ID, file ID and download completion time are recorded as one download, and the number of log entries in that download is the number of data sources used; downloads with the same number of data sources are placed in the same class.
Specifically, step S102 includes: sorting each class of log data by file size in ascending order; and calculating the maximum, minimum, median and mean of each measurement index for downloads of different file sizes within each class, where the measurement indexes to be calculated are: average download speed (AS), download success rate (SR), download abandon rate (AR), and diversity of data source participation time (DPT).
Specifically, the average download speed (AS), the download success rate (SR), the download abandon rate (AR), and the diversity of data source participation time (DPT) in step S102 are calculated as follows:
Figure GDA0002520542410000041
Figure GDA0002520542410000042
where DownloadSize is the number of bytes downloaded, DownloadTime is the download duration, SuccessNumber is the number of successfully downloaded tasks, TotalNumber is the number of all download tasks, CancelNumber is the number of tasks actively cancelled by the user, N is the number of data sources participating in the download, T_i is the length of time for which the i-th data source participates in a download task, T_max is the maximum participation time among all data sources in a download task, T_min is the minimum participation time among all data sources in a download task, and
Figure GDA0002520542410000043
is the average participation time of all data sources in a download task.
Specifically, step S103 includes: comparing three measurement indexes, namely average speed, download success rate and diversity of data source participation time, across different numbers of data sources and different file sizes, and deriving from the analysis the upper limit on the number of data sources to use when downloading files of each size;
and comparing three measurement indexes of multi-source downloading, namely average speed, download success rate and download abandon rate, for paid and free data sources, and deriving from the analysis the download strategy for files served by paid data sources, namely to prefer free data sources for downloading.
Specifically, step S105 includes: collecting, during the real-time download, the file size, the number of data sources currently in use, the download duration and the corresponding number of bytes for each data source, and the resource type of each data source, namely paid or free; and calculating the real-time average download speed (AS) of each data source.
Specifically, step S106 includes: if the number of data sources currently in use is less than or equal to the upper limit on the number of data sources and all data sources in use are free resources, keeping the current download links;
if any data source in use is a paid resource, replacing the paid resource with a free resource, and if no free resource is available, continuing to download with the paid resource;
and if the number of data sources currently in use is greater than the upper limit on the number of data sources, interrupting the data sources of poor quality, namely those with a low average download speed.
Corresponding to the multi-source download performance optimization method, the invention also discloses a multi-source download performance optimization device, which comprises:
a log data analysis module, used for sorting the existing log data at the cloud and classifying it according to the number of data sources used in each download, so that all logs of downloads using the same number of data sources are found;
a classified data statistics module, used for comparing and analyzing each class of log data at the cloud and computing the maximum, minimum, median and mean of each measurement index for different file sizes within each class;
a characteristic scheme association module, used for comprehensively comparing and analyzing the computed values of the measurement indexes at the cloud and deriving download schemes suitable for downloads with different characteristics;
a data transmission module, used for collecting, at the cloud, the download schemes suitable for downloads with different characteristics and sending them to the client;
a real-time data analysis module, used for collecting and analyzing data during the download at the client to obtain the current download performance;
and a download scheme selection module, used for taking the current download performance and the characteristics of the file being downloaded as the characteristic information of the current download, and selecting the optimal download scheme and adjusting accordingly to optimize multi-source download performance.
The multi-source download performance optimization method and device classify log data at the cloud and identify log data of the same type as required; analyze and compare the measurement indexes of the classified log data and compute the required statistics; comprehensively compare and analyze the results for these indexes and define a suitable download scheme for each set of download characteristics; send the download schemes obtained from the cloud analysis to the client; analyze data collected in real time during downloading to obtain the current download performance; and match the current download against the existing download schemes to select a suitable one. The method and the corresponding device address problems in current multi-source downloading such as poor performance when more data sources are used and a high abandon rate when paid data sources are used.
Drawings
Fig. 1 is a flowchart of a multi-source download performance optimization method provided in an embodiment of the present invention;
Fig. 2 is a schematic diagram of a multi-source download performance optimization apparatus according to an embodiment of the present invention.
Detailed Description
To facilitate an understanding of the invention, the invention will now be described more fully with reference to the accompanying drawings. Preferred embodiments of the present invention are shown in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete. It will be understood that when an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Referring to Fig. 1, Fig. 1 is a flowchart of a multi-source download performance optimization method according to an embodiment of the present invention. The multi-source download performance optimization method in this embodiment comprises the following steps:
S101, the cloud sorts the existing log data and classifies it according to the number of data sources used in each download, so that all logs of downloads using the same number of data sources are found.
The cloud sorts the download logs in ascending order by three columns, namely user ID, file ID and download completion time. If N data sources are used in a download task, the log entries of those N data sources have the same file ID, user ID and download completion time, so different download tasks can be distinguished. Log entries for data sources with the same user ID, file ID and download completion time are recorded as one download, and the number of log entries in that download is the number of data sources used. Downloads with the same number of data sources are placed in the same class: let N_a be the number of data sources used in download task A and N_b the number used in download task B; if N_a = N_b, the two tasks belong to the same class of download, otherwise they do not.
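The grouping in step S101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the row layout `(user_id, file_id, finish_time, source_id)`, the sample values, and the function name `classify_by_source_count` are all assumptions introduced here.

```python
from collections import Counter, defaultdict

# Hypothetical log rows: one row per data source used in a download.
# The field order (user_id, file_id, finish_time, source_id) is an assumption.
logs = [
    ("u1", "f1", 1000, "cdn-a"),
    ("u1", "f1", 1000, "p2p-b"),
    ("u2", "f1", 1200, "cdn-a"),
    ("u3", "f2", 1300, "cdn-a"),
    ("u3", "f2", 1300, "mirror-c"),
]

def classify_by_source_count(rows):
    """Group rows into download tasks, then class tasks by source count."""
    # Sort ascending by user ID, file ID and completion time; rows sharing
    # all three values form one download task, and the number of rows in a
    # task is the number of data sources it used.
    sources_per_task = Counter((u, f, t) for u, f, t, _ in sorted(rows))
    classes = defaultdict(list)
    for task, n_sources in sources_per_task.items():
        classes[n_sources].append(task)
    return dict(classes)

classes = classify_by_source_count(logs)
```

Here `classes[2]` holds the two tasks that used two data sources and `classes[1]` holds the single-source task, which corresponds to the N_a = N_b test described above.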
S102, the cloud compares and analyzes each class of log data and computes the maximum, minimum, median and mean of each measurement index for different file sizes within each class.
Each class of log data is sorted by file size in ascending order: if k classes of downloads are obtained in step S101, let S_1, S_2, ..., S_k denote the resulting sets of log data for the k classes. Step S102 is described below taking S_i as an example. The log data in S_i are sorted by file size in ascending order.
The maximum, minimum, median and mean of each measurement index are calculated for downloads of different file sizes within each class, where the measurement indexes to be calculated are: average download speed (AS), download success rate (SR), download abandon rate (AR), and diversity of data source participation time (DPT), recorded as AS_i, SR_i, AR_i and DPT_i respectively.
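A minimal sketch of the per-class statistics is given below. The text does not say how file sizes are grouped, so the two-bucket split at `LARGE_FILE_BYTES`, the sample values, and the function name `summarize` are assumptions made only for illustration.

```python
from statistics import mean, median

# Hypothetical (file_size_bytes, metric_value) pairs for one class S_i,
# e.g. the average download speed observed for each task in that class.
samples = [
    (5_000_000, 300.0),
    (8_000_000, 500.0),
    (2_000_000_000, 900.0),
    (3_000_000_000, 700.0),
    (9_000_000, 400.0),
]

# Assumed bucket boundary; the source text does not define size buckets.
LARGE_FILE_BYTES = 1_000_000_000

def summarize(pairs):
    """Return max, min, median and mean of the metric per file-size bucket."""
    buckets = {"small": [], "large": []}
    for size, value in sorted(pairs):  # ascending by file size
        key = "large" if size >= LARGE_FILE_BYTES else "small"
        buckets[key].append(value)
    return {
        name: {
            "max": max(values),
            "min": min(values),
            "median": median(values),
            "mean": mean(values),
        }
        for name, values in buckets.items()
        if values
    }

stats = summarize(samples)
```

The same function can be applied once per metric (AS, SR, AR, DPT) and once per class S_1 to S_k to build the table that step S103 compares.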
The average download speed (AS), the download success rate (SR), the download abandon rate (AR), and the diversity of data source participation time (DPT) are calculated as follows:
Figure GDA0002520542410000071
Figure GDA0002520542410000072
where DownloadSize is the number of bytes downloaded, DownloadTime is the download duration, SuccessNumber is the number of successfully downloaded tasks, TotalNumber is the number of all download tasks, CancelNumber is the number of tasks actively cancelled by the user, N is the number of data sources participating in the download, T_i is the length of time for which the i-th data source participates in a download task, T_max is the maximum participation time among all data sources in a download task, T_min is the minimum participation time among all data sources in a download task, and
Figure GDA0002520542410000081
is the average participation time of all data sources in a download task.
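The formula images are not reproduced in this text, so the sketch below is an interpretation of the variable definitions rather than the patent's exact formulas. AS, SR and AR are written as the plain ratios that the definitions suggest. The DPT function is only an assumed stand-in (mean absolute deviation from the average participation time, normalized by T_max - T_min), since the original DPT expression cannot be recovered here.

```python
def average_speed(download_size, download_time):
    """AS, assumed to be bytes downloaded divided by download duration."""
    return download_size / download_time

def success_rate(success_number, total_number):
    """SR, assumed to be successful tasks divided by all tasks."""
    return success_number / total_number

def abandon_rate(cancel_number, total_number):
    """AR, assumed to be user-cancelled tasks divided by all tasks."""
    return cancel_number / total_number

def participation_diversity(durations):
    """Assumed stand-in for DPT, NOT the formula from the patent figure.

    Uses the quantities named in the text: N, T_i, T_max, T_min and the
    average participation time. Returns 0.0 when all sources participate
    for the same length of time.
    """
    n = len(durations)
    t_max, t_min = max(durations), min(durations)
    if t_max == t_min:
        return 0.0
    t_mean = sum(durations) / n
    return sum(abs(t - t_mean) for t in durations) / (n * (t_max - t_min))
```

For example, `average_speed(1000, 4)` gives 250.0 bytes per time unit, and `participation_diversity([10, 10, 10])` gives 0.0 because every source participated equally long.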
S103, the cloud comprehensively compares and analyzes the computed values of the measurement indexes and derives download schemes suitable for downloads with different characteristics.
Three measurement indexes, namely average speed, download success rate and diversity of data source participation time, are compared across different numbers of data sources and different file sizes, and the upper limit on the number of data sources to use when downloading files of each size is derived from the analysis.
Because of the anomaly in multi-source downloading, namely that downloads using many data sources perform poorly, the values of the three measurement indexes vary in a regular pattern with the number of data sources used when files of different sizes are downloaded, so the optimal upper limit on the number of data sources for each file size can be derived from the analysis.
Three measurement indexes of multi-source downloading, namely average speed, download success rate and download abandon rate, are compared for paid and free data sources. Previous research work found that downloads using paid data sources have a higher abandon rate. Analysis shows that, in general, service providers limit the speed of downloads that use paid resources, so the download speed becomes too low and the abandon rate rises. Therefore, for files served by paid data sources, the optimal download strategy is to prefer free data sources.
S104, the cloud collects the download schemes suitable for downloads with different characteristics and sends them to the client.
In this process, the data to be sent is the upper limit on the number of data sources corresponding to each file size.
S105, the client collects and analyzes data during the download to obtain the current download performance.
In this process, the data to be collected mainly comprises the size of the file being downloaded, the number N of data sources currently in use, the download duration and the corresponding number of downloaded bytes for each data source, and the resource type of each data source, namely paid or free; the real-time average download speed of each data source is then calculated.
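The client-side measurement in step S105 can be sketched as below. The `SourceSample` record, its field names and the function name `per_source_speed` are hypothetical; the text only states which quantities are collected.

```python
from dataclasses import dataclass

@dataclass
class SourceSample:
    """Hypothetical record of what the client tracks per data source."""
    source_id: str
    bytes_downloaded: int
    seconds_active: float
    is_paid: bool  # resource type: paid or free

def per_source_speed(samples):
    """Real-time average speed (bytes per second) for each data source."""
    return {
        s.source_id: (s.bytes_downloaded / s.seconds_active
                      if s.seconds_active > 0 else 0.0)
        for s in samples
    }

samples = [
    SourceSample("cdn-a", 4_000_000, 8.0, False),
    SourceSample("paid-mirror", 500_000, 10.0, True),
    SourceSample("p2p-b", 0, 0.0, False),  # connected but not yet active
]
speeds = per_source_speed(samples)
```

Sources with no active time are reported at speed 0.0 rather than raising a division error, which is a design choice of this sketch and not something the text specifies.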
S106, the client takes the current download performance and the characteristics of the file being downloaded as the characteristic information of the current download, and selects the optimal download scheme and adjusts accordingly to optimize multi-source download performance.
When the file size is F, if the number N of data sources currently in use is less than or equal to the upper limit N_f on the number of data sources and all data sources in use are free resources, the current download links are kept;
when the file size is F, if any data source in use is a paid resource, the paid resource is replaced with a free resource, and if no free resource is available, downloading continues with the paid resource;
when the file size is F, if the number N of data sources currently in use is greater than the upper limit N_f on the number of data sources, the data sources of poorer quality, i.e. those with a lower average download speed, are interrupted. It should be noted that the threshold below which an average download speed is considered low can be set flexibly according to the actual situation.
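The three rules of step S106 can be sketched as a function that returns a list of adjustment actions. This is one possible reading, not the patented implementation: the dictionary layout, the name `plan_adjustments`, the choice to interrupt only as many slow sources as exceed N_f, and the order in which the rules are applied are all assumptions.

```python
def plan_adjustments(active, upper_limit, spare_free_ids, slow_threshold):
    """Return adjustment actions for the current download.

    active: list of dicts with keys "id", "speed" and "paid" (hypothetical).
    upper_limit: N_f, the source-count limit for the current file size.
    spare_free_ids: IDs of free sources not yet in use (assumed input).
    slow_threshold: speed below which a source counts as slow (configurable).
    """
    actions = []
    remaining = list(active)

    # Over the limit: interrupt the slowest sources below the threshold,
    # but no more of them than the number of sources exceeding N_f.
    excess = len(remaining) - upper_limit
    if excess > 0:
        slow = sorted(
            (s for s in remaining if s["speed"] < slow_threshold),
            key=lambda s: s["speed"],
        )
        for src in slow[:excess]:
            actions.append(("interrupt", src["id"]))
            remaining.remove(src)

    # Paid sources: swap for a free one when available, otherwise keep.
    spare = list(spare_free_ids)
    for src in remaining:
        if src["paid"] and spare:
            actions.append(("replace", src["id"], spare.pop(0)))

    # An empty list means the current download links are kept as they are.
    return actions

active = [
    {"id": "a", "speed": 800, "paid": False},
    {"id": "b", "speed": 50, "paid": True},
    {"id": "c", "speed": 30, "paid": False},
    {"id": "d", "speed": 600, "paid": False},
]
actions = plan_adjustments(active, upper_limit=3,
                           spare_free_ids=["free-x"], slow_threshold=100)
```

With four active sources and N_f = 3, the slowest source "c" is interrupted and the paid source "b" is swapped for the spare free source. A download that is within the limit and uses only free sources yields an empty action list.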
Referring to Fig. 2, Fig. 2 is a schematic diagram of a multi-source download performance optimization apparatus according to an embodiment of the present invention.
Corresponding to the above multi-source download performance optimization method, this embodiment further discloses a multi-source download performance optimization device, which specifically includes:
The log data analysis module 201 is configured to sort the existing log data at the cloud and classify it according to the number of data sources used in each download, so that all logs of downloads using the same number of data sources are found.
The cloud sorts the download logs in ascending order by three columns, namely user ID, file ID and download completion time. If N data sources are used in a download task, the log entries of those N data sources have the same file ID, user ID and download completion time, so different download tasks can be distinguished. Log entries for data sources with the same user ID, file ID and download completion time are recorded as one download, and the number of log entries in that download is the number of data sources used. Downloads with the same number of data sources are placed in the same class: let N_a be the number of data sources used in download task A and N_b the number used in download task B; if N_a = N_b, the two tasks belong to the same class of download, otherwise they do not.
The classified data statistics module 202 is configured to compare and analyze each class of log data at the cloud and compute the maximum, minimum, median and mean of each measurement index for different file sizes within each class.
Each class of log data is sorted by file size in ascending order: if k classes of downloads are obtained in step S101, let S_1, S_2, ..., S_k denote the resulting sets of log data for the k classes. Step S102 is described below taking S_i as an example. The log data in S_i are sorted by file size in ascending order.
The maximum, minimum, median and mean of each measurement index are calculated for downloads of different file sizes within each class, where the measurement indexes to be calculated are: average download speed (AS), download success rate (SR), download abandon rate (AR), and diversity of data source participation time (DPT), recorded as AS_i, SR_i, AR_i and DPT_i respectively.
The average download speed (AS), the download success rate (SR), the download abandon rate (AR) and the diversity of data source participation time (DPT) are calculated in the same way as in the method described above.
The characteristic scheme association module 203 is configured to comprehensively compare and analyze the computed values of the measurement indexes at the cloud and derive download schemes suitable for downloads with different characteristics.
Three measurement indexes, namely average speed, download success rate and diversity of data source participation time, are compared across different numbers of data sources and different file sizes, and the upper limit on the number of data sources to use when downloading files of each size is derived from the analysis.
Because of the anomaly in multi-source downloading, namely that downloads using many data sources perform poorly, the values of the three measurement indexes vary in a regular pattern with the number of data sources used when files of different sizes are downloaded, so the optimal upper limit on the number of data sources for each file size can be derived from the analysis.
Three measurement indexes of multi-source downloading, namely average speed, download success rate and download abandon rate, are compared for paid and free data sources. Previous research work found that downloads using paid data sources have a higher abandon rate. Analysis shows that, in general, service providers limit the speed of downloads that use paid resources, so the download speed becomes too low and the abandon rate rises. Therefore, for files served by paid data sources, the optimal download strategy is to prefer free data sources.
The data transmission module 204 is configured to summarize the download schemes suited to downloads with different characteristics and then send the summarized download schemes to the client.
In this process, the data to be sent is the upper limit on the number of data sources corresponding to each file size.
The real-time data analysis module 205 is configured for the client to collect and analyze data during downloading to obtain the current download performance.
In this process, the data to be collected mainly comprises the size of the file being downloaded, the number N of currently used data sources, the download duration and the corresponding number of bytes downloaded for each data source, and the resource type used by each data source, namely charged or free; the real-time average download speed of each data source is then calculated.
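The per-source calculation can be sketched as follows. This is a minimal illustration that assumes the real-time average speed of a data source is its downloaded byte count divided by its download duration. The `SourceSample` record and its field names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class SourceSample:
    """Hypothetical record of what the client collects for one data source."""
    source_id: str
    downloaded_bytes: int
    elapsed_seconds: float
    charged: bool  # True for a charged resource, False for a free one


def per_source_speed(samples):
    """Return the real-time average speed (bytes per second) of each source.

    A source that has not yet accumulated any download time is reported
    with speed 0.0 instead of raising a division error.
    """
    speeds = {}
    for s in samples:
        if s.elapsed_seconds > 0:
            speeds[s.source_id] = s.downloaded_bytes / s.elapsed_seconds
        else:
            speeds[s.source_id] = 0.0
    return speeds
```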
The download scheme selection module 206 is configured to take the current download performance and the characteristics of the downloaded file as the characteristic information of the current download, and to select the optimal download scheme and adjust accordingly, so as to optimize multi-source download performance.
When the file size is F, if the number N of currently used data sources is less than or equal to the upper limit N_f on the number of data sources and the data sources in use are all free resources, the current download links are kept;
when the file size is F, if the data sources in use include a charged resource, the charged resource is replaced by a free resource, and if no free resource is available, the charged resource continues to be used for downloading;
when the file size is F, if the number N of currently used data sources is greater than the upper limit N_f on the number of data sources, the data sources of poorer quality, that is, those with a lower average download speed, are interrupted. It should be noted that the threshold below which an average download speed is considered "low" can be set flexibly according to the actual situation.
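The three rules above can be sketched as a single adjustment function. This is an illustration under stated assumptions, not the patented implementation: each source is a dictionary with hypothetical keys `id`, `charged` and `speed`; candidate free sources carry an estimated speed; and instead of a configurable "low speed" threshold, the sketch simply interrupts the slowest sources until the count no longer exceeds the upper limit.

```python
def adjust_sources(active, upper_limit, free_pool):
    """Apply the selection rules to the active sources of one download.

    active      -- sources currently in use (dicts with id, charged, speed)
    upper_limit -- upper limit on the number of sources for this file size
    free_pool   -- spare free sources that could replace charged ones
    Returns (kept, dropped).
    """
    pool = list(free_pool)
    sources = []
    for src in active:
        if src["charged"] and pool:
            # Replace a charged resource with a free one when one exists.
            sources.append(pool.pop(0))
        else:
            # Free source, or charged source with no free replacement.
            sources.append(src)

    # Interrupt the slowest sources only when the limit is exceeded.
    excess = max(0, len(sources) - upper_limit)
    dropped = sorted(sources, key=lambda s: s["speed"])[:excess]
    dropped_ids = {s["id"] for s in dropped}
    kept = [s for s in sources if s["id"] not in dropped_ids]
    return kept, dropped
```

When the number of sources is within the limit and all are free, the function returns the active list unchanged with nothing dropped, which corresponds to keeping the current download links.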
In this technical scheme, the cloud log data are classified and log data of the same type are identified as required; the measurement indexes of the classified log data are analyzed and compared, and the required results are obtained by statistics and calculation; the results of the measurement indexes are comprehensively compared and analyzed, and applicable download schemes are set for different download characteristics; the download schemes obtained from the cloud analysis are sent to the client; data collected in real time during downloading are analyzed to obtain the current download performance; and the current download is matched against the existing download schemes to select a suitable one. The method and the corresponding device thereby address two problems in current multi-source downloading: downloads that use more data sources perform poorly, and downloads that use charged data sources have a high abandon rate.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The technical principle of the present invention is described above in connection with specific embodiments. The description is made for the purpose of illustrating the principles of the invention and should not be construed in any way as limiting the scope of the invention. Based on the explanations herein, those skilled in the art will be able to conceive of other embodiments of the present invention without inventive effort, which would fall within the scope of the present invention.

Claims (5)

1. A multi-source download performance optimization method is characterized by comprising the following steps:
S101, the cloud sorts the existing log data and classifies the log data according to the number of data sources used in downloading, finding all logs of downloads that used the same number of data sources; wherein the step S101 includes: the cloud sorts the download logs in ascending order by the three columns of user ID, file ID and download completion time; recording the data sources with the same user ID, file ID and download completion time as one download, wherein the number of log entries in one download is the number of data sources used; and grouping downloads having the same number of data sources into one class;
S102, the cloud carries out comparative analysis on the classified log data, and obtains by statistics the maximum value, the minimum value, the median value and the mean value of the measurement indexes for different file sizes in each class of data; wherein the step S102 includes: sorting each class of log data in ascending order by file size; calculating the maximum value, the minimum value, the median value and the mean value of the measurement indexes of downloads with different file sizes in each class of data, wherein the measurement indexes to be calculated comprise: average download speed (AS), download success rate (SR), download abandon rate (AR), and diversity of data source participation time (DPT); the average download speed (AS), the download success rate (SR), the download abandon rate (AR) and the diversity of data source participation time (DPT) are calculated as follows:
Figure FDA0002520542400000011
Figure FDA0002520542400000012
wherein DownloadSize represents the number of bytes downloaded, DownloadTime represents the download duration, SuccessNumber represents the number of successfully downloaded tasks, TotalNumber represents the number of all download tasks, CancelNumber represents the number of tasks actively cancelled by the user, N represents the number of data sources participating in downloading, T_i represents the duration for which the i-th data source participates in downloading in one download task, T_max represents the maximum duration for which any data source participates in downloading in one download task, T_min represents the minimum duration for which any data source participates in downloading in one download task,
Figure FDA0002520542400000021
representing the average value of the download duration of all data sources in one download task;
S103, the cloud comprehensively compares and analyzes the statistically obtained values of the measurement indexes, and summarizes the download schemes suited to downloads with different characteristics;
S104, the cloud summarizes the download schemes suited to downloads with different characteristics and sends them to the client;
S105, the client collects and analyzes data during downloading to obtain the current download performance;
S106, the client takes the current download performance and the characteristics of the downloaded file as the characteristic information of the current download, and selects the optimal download scheme and adjusts accordingly, so as to optimize the performance of multi-source downloading.
2. The multi-source download performance optimization method according to claim 1, wherein the step S103 comprises: comparing three measurement indexes of average speed, downloading success rate and diversity of data source participation time under the condition of using different numbers of data sources and different file sizes, and analyzing to obtain the upper limit of the number of the data sources required to be used in the downloading of the different file sizes;
and comparing three measurement indexes of average speed, downloading success rate and downloading abandon rate of multi-source downloading under the conditions of data source charging and free, and analyzing to obtain a downloading strategy adopted by the charging file, namely preferentially using the free data source for downloading.
3. The multi-source download performance optimization method according to claim 2, wherein the step S105 comprises: acquiring the size of a file in real-time downloading, the number of currently used data sources, the downloading time length and the corresponding number of bytes of each data source, and the resource type used by each data source, namely charging or free; and calculating to obtain the real-time downloading Average Speed (AS) of each data source.
4. The multi-source download performance optimization method according to claim 3, wherein the step S106 comprises: if the number of the currently used data sources is less than or equal to the upper limit of the number of the data sources and the used data sources are free resources, maintaining the current download link;
if the used data source has the charged resource, the charged resource is replaced by the free resource, and if the free resource is not available, the charged resource is continuously used for downloading;
and if the number of the currently used data sources is larger than the upper limit of the number of the data sources, interrupting the data sources with poor quality, namely the data sources with low average downloading speed.
5. A multi-source download performance optimizing apparatus using the multi-source download performance optimizing method according to claim 1, the apparatus comprising:
the log data analysis module is used for sorting the existing log data through the cloud end and classifying the log data according to the number of data sources used in downloading to find out all logs downloaded by using the same number of data sources;
the classified data statistics module is used for carrying out comparative analysis on various classified log data through the cloud end, and carrying out statistics to obtain the maximum value, the minimum value, the median value and the mean value of various measurement indexes with different file sizes in each type of data;
the characteristic scheme association module is used for comprehensively comparing and analyzing the values of the measurement indexes obtained through statistics through the cloud end, and summarizing and concluding downloading schemes suitable for downloading of different characteristics;
the data transmission module is used for summarizing the downloading schemes with different characteristics and suitable for downloading through the cloud and then sending the summarized downloading schemes to the client;
the real-time data analysis module is used for collecting and analyzing data in the downloading process of the client to obtain the current downloading performance;
and the download scheme selection module is used for selecting the optimal download scheme to adjust so as to optimize the multi-source download performance by taking the current download performance and the characteristics of the downloaded file as the characteristic information of the current download.
CN201710031275.5A 2017-01-17 2017-01-17 Multi-source download performance optimization method and device Expired - Fee Related CN106850778B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710031275.5A CN106850778B (en) 2017-01-17 2017-01-17 Multi-source download performance optimization method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710031275.5A CN106850778B (en) 2017-01-17 2017-01-17 Multi-source download performance optimization method and device

Publications (2)

Publication Number Publication Date
CN106850778A CN106850778A (en) 2017-06-13
CN106850778B true CN106850778B (en) 2020-10-23

Family

ID=59123576

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710031275.5A Expired - Fee Related CN106850778B (en) 2017-01-17 2017-01-17 Multi-source download performance optimization method and device

Country Status (1)

Country Link
CN (1) CN106850778B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108111844B (en) * 2017-12-27 2019-11-15 北京奇艺世纪科技有限公司 A kind of monitoring method of Video service quality, apparatus and system
CN117354374B (en) * 2023-12-06 2024-02-13 广东车卫士信息科技有限公司 Data transmission method and system based on Internet of Things

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103152382A (en) * 2013-01-15 2013-06-12 中国科学技术大学苏州研究院 Multi-file simultaneous-transmission control method directed at multi-host network

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100169195A1 (en) * 2005-02-03 2010-07-01 Bernard Trest Preventing unauthorized distribution of content on computer networks
US20090280907A1 (en) * 2008-04-30 2009-11-12 Bally Gaming, Inc. Server client network throttling system for download content
CN102025595A (en) * 2009-09-22 2011-04-20 常诚 Flow optimization method and system
CN101764807B (en) * 2009-12-16 2012-09-05 北京邮电大学 Multisource internet resource device and method based on meta search engine
CN102855238A (en) * 2011-06-28 2013-01-02 腾讯科技(深圳)有限公司 Method and system for downloading resource data
CN105208059B (en) * 2014-06-19 2019-09-17 腾讯科技(深圳)有限公司 A kind of content distribution method, terminal, server and system
CN105577830A (en) * 2016-02-02 2016-05-11 明博教育科技股份有限公司 Optimal selection method and system for downloading list based on statistics

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103152382A (en) * 2013-01-15 2013-06-12 中国科学技术大学苏州研究院 Multi-file simultaneous-transmission control method directed at multi-host network

Also Published As

Publication number Publication date
CN106850778A (en) 2017-06-13

Similar Documents

Publication Publication Date Title
US10027739B1 (en) Performance-based content delivery
US9769248B1 (en) Performance-based content delivery
US9456014B2 (en) Dynamic workload balancing for real-time stream data analytics
Da Silva et al. Identification and selection of flow features for accurate traffic classification in SDN
US11108676B2 (en) Method and system for detecting network quality based on a network fluctuation model
US9491225B2 (en) Offline download method and system
Ma et al. A content distribution system based on sparse linear network coding
CN106850778B (en) Multi-source download performance optimization method and device
CN103945198A (en) System and method for controlling streaming media route of video monitoring system
CN113259256B (en) Repeating data packet filtering method and system and readable storage medium
CN112949739A (en) Information transmission scheduling method and system based on intelligent traffic classification
CN110324260B (en) Network function virtualization intelligent scheduling method based on flow identification
US9954921B2 (en) Rate-adaptive data stream management system and method for controlling the same
CN104679590A (en) Map optimization method and device in distributive calculating system
Karamshuk et al. ISP-friendly peer-assisted on-demand streaming of long duration content in BBC iPlayer
CN113098724A (en) Server tuning method, system and device
CN111163006B (en) Multipath preferred online game acceleration method based on waveform judgment
US8984100B2 (en) Data downloading method, terminal, server, and system
CN106547683A (en) A kind of redundant code detection method and device
CN108512921A (en) Document down loading method, electronic equipment based on P2P networks and storage medium
Huang et al. Application identification system for SDN QoS based on machine learning and DNS responses
Li et al. High performance flow feature extraction with multi-core processors
US20160294940A1 (en) Data download method and device
Hayashi et al. P2PTV traffic classification and its characteristic analysis using machine learning
CN110971536A (en) Outbound load balancing implementation method based on P2P flow

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201023