WO2017161124A1

WO2017161124A1 - System for video streaming using delay-aware fountain codes

Info

Publication number: WO2017161124A1
Application number: PCT/US2017/022719
Authority: WO
Inventors: Dapeng Oliver Wu; Kairan SUN; Qiuyuan Huang
Original assignee: University Of Florida Research Foundation, Incorporated
Priority date: 2016-03-16
Filing date: 2017-03-16
Publication date: 2017-09-21

Abstract

A network system for increasing data throughput and decreasing transmission delay along a data link from a source node to a sink node, the network system comprising: a first node configured to: encode a plurality of video data packets using rateless coding based on at least one video encoding characteristic of a video source of the plurality of video data packets by adaptively adjusting at least one parameter of the rateless coding based on the at least one video encoding characteristic, and transmit, over the data link to at least one second node, the plurality of video data packets.

Description

SYSTEM FOR VIDEO STREAMING USING DELAY- AWARE FOUNTAIN CODES

RELATED APPLICATIONS

This application claims priority to and the benefit of U.S. Provisional Patent

Application Number 62/309,023, entitled "SYSTEM FOR VIDEO STREAMING USING DELAY-AWARE FOUNTAIN CODES," filed March 16, 2016. The entire contents of the foregoing are hereby incorporated herein by reference.

BACKGROUND

In the recent decade, we have witnessed the bloom of video services over the Internet.

Some of them provide pre-recorded video streams such as YouTube and Netflix; others provide live video communications such as Skype and Facetime. As expected, a huge amount of multimedia content will be generated and consumed. On the other hand, the prevalent smart mobile devices make this content more accessible to people than ever. Thanks to the evolution of communication technologies, such as 3G/4G, LTE, WiFi, etc., wireless networks are widely available in our daily lives. However, despite the promising developments, the stochastic nature of wireless channels still persists: its vulnerability to channel noise, inter-user interference and low data rate under mobility. The problems easily deteriorate in

video-dominant applications where the requirement on channel quality is the highest. As a result, how to stream videos with low delay, stable data rate, and high quality over wireless network raises formidable challenges.

To provide reliable video transmission, advanced coding and signal processing techniques, such as forward error correction (FEC) erasure codes, have been proposed. One important class of FEC codes are fountain codes, such as Luby transform (LT) code and Raptor code. Fountain codes are effective for wireless video streaming for its ratelessness: fountain codes will reconstruct the original data using the redundancy sent by the sender, without demanding acknowledgments (ACK) or retransmissions. The traditional fountain codes are initially designed for achieving the complete decoding of the entire original file. That means that if a video file is transmitted using traditional fountain codes, users may not watch it until the whole video file is successfully decoded. However, some video streaming applications are delay-aware and loss-tolerant, which means (i) the time interval between video being generated and being played may not exceed a certain threshold; and (ii) partial decoding is tolerable, albeit higher decoding ratio is still preferred. SUMMARY

Some aspects include a network system for increasing data throughput and decreasing transmission delay along a data link from a source node to a sink node. The network system may comprise: a first node configured to: encode a plurality of video data packets using rateless coding based on at least one video encoding characteristic of a video source of the plurality of video data packets by adaptively adjusting at least one parameter of the rateless coding based on the at least one video encoding characteristic, and transmit, over the data link to at least one second node, the plurality of video data packets.

Further aspects include at least one computer-readable storage medium encoded with executable instructions that, when executed by at least one processor, cause the at least one processor to perform a method for transmitting a video stream over a data link from a source node to a sink node via a relay node. The method may comprise: encoding a plurality of video data packets using rateless coding based on at least one video encoding characteristic of a video source of the plurality of video data packets by adaptively adjusting at least one parameter of the rateless coding based on the at least one video encoding characteristic; and transmitting, over the data link, the plurality of video data packets.

BRIEF DESCRIPTION OF THE DRAWINGS FIG. 1 is a block diagram illustrating an exemplary source node, relay nodes, and a sink (or destination) node of a network in which some embodiments of the application may be implemented.

FIG. 2 is a flowchart of an exemplary method of increasing data throughput and decreasing transmission delay along a data link from a source node to a sink node according to some embodiments.

FIG. 3 is a flowchart of an additional exemplary method of increasing data throughput and decreasing transmission delay along a data link from a source node to a sink node according to some embodiments.

FIG. 4 is a set of diagrams illustrating a comparison of coding structure between (a) block coding scheme and (b) sliding window scheme according to some embodiments.

FIG. 5 is an additional set of diagrams illustrating a comparison between (a) block coding scheme and (b) sliding window scheme according to some embodiments.

FIG. 6 is a set of charts illustrating comparisons of accumulated sampling probabilities using different sliding window schemes and optimization strategies according to some embodiments. FIG. 7 is a chart illustrating an exemplary optimization result according to some embodiments.

FIG. 8 is a set of diagrams illustrating exemplary sampling distributions according to some embodiments.

FIG. 9 is a chart illustrating exemplary distribution functions for different slope factors according to some embodiments.

FIG. 10 is a chart illustrating an exemplary sampling distribution according to some embodiments.

FIG. 11 is a chart illustrating an exemplary result of using exemplary sampling distributions according to some embodiments.

FIG. 12 is a diagram illustrating exemplary warm-up and cool-down periods according to some embodiments.

FIG. 13 is a diagram illustrating an exemplary header structure of a data packet according to some embodiments.

FIG. 14 is a flowchart of an exemplary encoder according to some embodiments.

FIG. 15 is a flowchart of an exemplary decoder according to some embodiments.

FIG. 16 is a chart illustrating exemplary experimental results according to some embodiments.

FIG. 17 is a chart illustrating additional exemplary experimental results according to some embodiments.

FIG. 18 is a chart illustrating a comparison of exemplary fountain code schemes according to some embodiments.

FIG. 19 is a diagram illustrating a computer system on which some embodiments of the invention may be implemented.

DETAILED DESCRIPTION

As the inventors have recognized and appreciated, delay awareness may be introduced into fountain codes by partitioning the video file into fixed-length data blocks, separately encoding them, and transmitting them sequentially. This method may be called a block coding scheme. From the perspective of video transmission, a smaller block size is preferred, because it leads to shorter playback latency. From the perspective of fountain codes, however, the inventors have recognized and appreciated that the block size needs to be as big as possible to maintain a smaller coding overhead; the fundamental trade-off between video watching experience and coding performance is crucial for the design of delay-aware fountain codes. The inventors have recognized and appreciated that a delay-aware fountain code scheme for video streaming may be provided by deeply integrating channel coding and video coding. Some embodiments may be based on sliding window fountain codes (SWFC), which partitions a file into many overlapping data windows and transmits them sequentially. The inventors have recognized and appreciated that other joint fountain- and video-coding designs have not fully exploited the characteristics of multimedia content in order to improve the video watching experience.

The inventors have recognized and appreciated that treating the sliding windows as non-homogeneous (e.g., having varying characteristics) can improve the video watching experience. In some embodiments, the sliding windows may have variable length and sampling distributions, rather than having fixed length and uniform sampling distribution. In some embodiments, each window may be processed individually, which may provide a deep-level joint design of multimedia streaming and channel coding. Additionally, some embodiments may take into account video bit rate fluctuation and video coding parameters, such as group of pictures (GOP) size and frame rate, at the level of channel coding, and may exploit these in the design of the coding approach. The inventors have recognized and appreciated that, as a result, some embodiments may take advantage of all the benefits of fountain codes and improve upon them in the context of delay-aware video applications. The inventors have recognized and appreciated that a performance metric herein referred to as in-time decoding ratio may better reflect the real video watching experience and show that some embodiments significantly outperform existing encoding schemes.

In some embodiments, a time-based sliding window scheme may provide much desired delay awareness in video streaming. Unlike the existing SWFC schemes that have a fixed number of packets in each window, some embodiments may adaptively select window lengths according to the number of bits in frames. In this way, some embodiments can maximize the code word length in the coding blocks, so as to achieve higher coding gain within a bounded playback delay.

In some embodiments, a modified window-wise sampling strategy may provide a consistent watching experience. The inventors have recognized and appreciated that existing SWFCs uniformly sample and encode the packets within each window, which due to video bit rate fluctuation, causes the received video quality to be time-varying. The inventors have recognized and appreciated that by adjusting the sampling pattern according to the ongoing video bit rate, significantly higher decoding ratio than existing schemes may be achieved. The inventors have recognized and appreciated that integrating the above mentioned techniques into what is referred to herein as a Delay- Aware Fountain codes protocol (DAF) may deliver significant improvements to the video watching experience. The inventors have recognized and appreciated that computational complexity may be reduced using

lower-complexity version of DAF herein called DAF-L. Based on comparisons with conventional counterparts in various scenarios, some embodiments may yield the best overall performance.

Implementation of the System

FIG. 1 is a diagram illustrating a system 100 that may employ techniques for increasing data throughput and decreasing transmission delay from a source node to a sink node via a relay node as described herein. In the example of FIG. 1, a source node 110 (which may be referred to as first node) may encode data packets for transmission. In some embodiments, the source node 110 may encode the data packets using fountain coding (as illustrated at stage 210 of FIG. 2). However, any suitable coding, including rateless coding, may be used to encode the data packets. The source node 110 may also transmit the data packets to a first relay node 130 via connection 120 (as illustrated at stage 220 of FIG. 2), which may be a wireless connection. However, any suitable connection or communication technology may be used to communicate among the nodes.

The first relay node 130 may receive at least one of the data packets from the source node 110. In addition, the first relay node 130 may relay or transmit the data packets to a second relay node 150 via connection 140, which may be a wireless connection. The second relay node 150 may receive at least one of the data packets from the first relay node 130. In addition, the second relay node 150 may relay or transmit the data packets to a sink node 170 via connection 160, which may be a wireless connection. In some embodiments, source node 110 may be a server, such as a streaming video server. Alternatively or additionally, source node 110 may include a network ingress point, such as a gateway to a wireless network (e.g., a base station in a wireless network).

Additionally, sink node 170 may be a client, such as a video receiver and/or playback device. For example, sink node 170 may be a client of the server referred to as source node 110. Sink node 170 may be a wireless device, such as a smartphone, tablet, laptop computer, or desktop computer.

Alternatively or additionally, relay nodes, such as first relay node 130 and/or second relay node 150, may include network routers and/or network switches. In some embodiments, relay nodes, such as first relay node 130 and/or second relay node 150, may include hubs, and/or any other suitable components. Alternatively or additionally, relay nodes may include other cell transceivers in a cellular network, such as a 5G network. In some embodiments, the first relay node 130 and/or the second relay node 150 may regenerate, re-encode, and relay the data packets conditionally, based on the quantity of the data packets received at the given relay node. For example, the first relay node 130 and/or the second relay node 150 may receive a subset of the data packets, and based on the subset of the data packets, the first relay node 130 and/or the second relay node 150 may regenerate the data packets, re-encode the regenerated data packets, and transmit the regenerated, re-encoded data packets.

The sink node 170 may receive one or more data packets from the second relay node 150. If the sink node 170 has received a sufficient quantity of the data packets, the sink node 170 may regenerate and decode the data packets. FIG. 1 shows only two relay nodes, the first relay node 130 and the second relay node

150. This number of relay nodes is shown for simplicity of illustration. It should be appreciated that a network system may have many more nodes and relay nodes.

In some embodiments, source node 110 may encode a plurality of video data packets using rateless coding (as described above) based on at least one video encoding characteristic of a video source of the video data packets (as illustrated at stage 210 of FIG. 2) by adaptively adjusting at least one parameter of the rateless coding based on the at least one video encoding characteristic (as illustrated at stage 215 of FIG. 2). A video encoding characteristic may include any characteristic or property of how a video source is encoded, such as frame rate, number of frames in a group of pictures, video bit rate, and so on. Additionally, source node 110 may transmit the video data packets over a data link to a second node, which may be first relay node 130, second relay node 150, and/or sink node 170 (as illustrated at stage 220 of FIG. 2). For example, the second node may be a sink node configured to receive one or more of the plurality of video data packets from the streaming video server via at least one relay node, or at least one relay node configured to receive at least one of the plurality of video data packets from the streaming video server. In some embodiments, the data link is at least partially wireless.

In some embodiments, the rateless coding may comprise fountain coding. Alternatively or additionally, the video source may include video data, such as a video file. Alternatively or additionally, at least one video data packet of the plurality of video data packets comprises at least 100 bits, although any number of bits may be used.

In some embodiments, the plurality of overlapping sliding data windows may be non-homogeneous, as discussed herein. For example, the plurality of overlapping sliding data windows may collectively have more than one length and/or more than one sampling distribution.

In some embodiments, source node 110 may obtain the at least one video encoding characteristic by preprocessing the video source (as illustrated at stage 205 of FIG. 3).

Additionally, adaptively adjusting the at least one parameter of the rateless coding may comprise adaptively selecting a first length of a first data window and a second length of a second data window of a plurality of overlapping sliding data windows (as illustrated at stage

216 of FIG. 3). In some embodiments, the selecting may be based on a number of bits in frames of the video source and/or on a first number of frames in the video source. Additionally, the first length may comprise a second number of frames that the first data window can accommodate, and the second length may comprise a third number of frames that the second data window can accommodate.

In some embodiments, source node 110 may store the first length of the first data window in a header of a first packet of the first data window and may store the second length of the second data window in a header of a second packet of the second data window. In addition, source node 110 may store the first sampling distribution in the header of the first packet of the first data window as a first slope factor and may store the second sampling distribution in the header of the second packet of the second data window as a second slope factor.

In some embodiments, adaptively adjusting the at least one parameter of the rateless coding may comprise adaptively adjusting a first sampling distribution for the first data window and a second sampling distribution for the second data window (as illustrated at stage

217 of FIG. 3). Additionally, the adjusting may be based on a video bit rate of the video source. In addition, the video bit rate may be variable.

In some embodiments, source node 110 may segment data from the frames of the video source into the plurality of video data packets based on at least the first length and the second length (as illustrated at stage 218 of FIG. 3).

In some embodiments, the at least one video encoding characteristic may comprise a frame rate of the video source, a number of frames in a group of pictures of the video source, and/or a video bit rate of the video source. Additionally, the video bit rate may be variable.

Delay-aware Sliding Window Fountain Codes

Concepts in both fountain codes and video coding will be frequently referred to herein, and so two sets of variables in are defined in Table I and II. Table I lists the variables related to fountain codes, and Table II lists the properties related to video coding.

For notation simplicity, all the concepts relating to "time" herein are actually in the unit of "number of frames". The total number of native packets k may be expressed as

The definitions of the other variables will be introduced when they are used later herein. Sliding Window vs. Block Coding

As shown in FIG. 4a, the block coding scheme may have a relatively small block size, and the coded packets for each block are only linked to the source packets in a small window. But in FIG. 7b, the overlap between sliding windows makes decoded packets in one window to help the decoding of other windows. In that sense, the size of the window is virtually extended. As a result, sliding window schemes virtually extend the block size, so as to enhance the performance of the fountain codes by reducing the overhead.

The inventors have recognized and appreciated that an expanding window fountain code scheme in which, instead of using the overlapping fixed-size windows, the packets in each window must be a subset of the next window, may not be suitable for video streaming, since the decoding probability is unbalanced: the probability of decoding the frames in the beginning may be higher than the later ones. Neither that scheme nor a block coding scheme, with a virtual block size expanded by duplicating all symbols in each block, may provide careful insights to the relationship between block size and the end-to-end delay in delay-aware applications.

The inventors have recognized and appreciated that using an SWFC scheme may provide such insights. The step size between two consecutive windows may be Δt. For simplicity, W and T may be assumed to be integral multiples of Δt. In order to avoid dividing the frames from one GOP into different windows, Δt should be an integral multiple of N_GOP-

One noteworthy difference between SWFC and block coding is the relationship among Δt, 7" Del ,ay and window size W. For block coding ^σ, as shown in FIG. 5a, because the receiver can only start to play the content in current block when the transmission for this block finishes, and the sender can only start to encode and send the next block's packets when all the packets in the next block are available, the end-to-end play delay T_Delay≥2W. For SWFC, if the step size is Δt, the encoder can start to encode the next window as soon as next Δt packets are available, so the end-to-end play delay T_Delay≥W+Δt. The above relationships implicitly impose the maximal window size (which corresponds to the best coding efficiency) we can set for both schemes. If we deem block coding as a special case of sliding window when Δt=W, we can see the sliding window cannot exceed . We also know that the biggest window size is obtained when

Δt=l, as shown in FIG. 5b.

The number of windows to be sent, N wi .nd ,ow , could be obtained by

derived parameter is the number of coded packets to be sent within each sliding window, which is a link between fountain codes and video coding: Then, the total number of

coded - p¹ ackets to be sent, N, can be defined based on The

overall code rate C is then defined using

Time-based Window Size vs. Packet-based Window Size

In some embodiments, the size of windows may be based on time (or interchangeably speaking, based on the number of frames to be transmitted in a window, which may vary in size such that a given window may include a variable number of packets).

Although the specific methods may vary, some existing work designed the delay-aware fountain codes based on the following core idea: group the video data into windows (either overlapping or non-overlapping), and send the windows one by one within each period of time. In the aforementioned work, the coding parameters, such as the size of the windows, the speed of window movement, and the total length of the data, are constant numbers based on the number of packets. The inventors have recognized and appreciated that the number of packets may be an abstraction of the number of frames.

However, the inventors have recognized and appreciated that the packet-based schemes ignored an important characteristic of video data: there are different amounts of bits for each frame. Even if rate control techniques are used, they may inevitably lead to bit rate fluctuation and video quality degradation. This fact makes packet-based windows different from time-based windows. The inventors have recognized and appreciated that dividing the video streaming data into blocks with fixed number of packets may result in the following phenomena:

1. Improper partition of frames and GOPs: Because a frame may contain various number of packets, it is highly likely that packets from one frame, or one GOP, to fall into two blocks, thus causing video playback error. 2. Uncontrollable delay: Because there are different amounts of frames in each block, the time of delay varies from time to time, so the resulting time delay is unknown. As a result, packet-based window cannot be used in real-time or delay-aware systems. Even if we have to use packet-based window in delay-aware systems, we need to know the number of packets in each frame of the video before-hand, and select the fewest number of packets in any Γ frame period as the window size. In that case, the tolerable delay

is underused for most of the time, which contradicts the designing principle of

making the best use of delay.

3. Unstable data rate: Using fixed code rate, the encoder will generate the same amount of coded packets within any packet-based window. However, because of the nonuniformity of video bit rate, the data rate will be different in different time periods.

On the contrary, the inventors have recognized and appreciated that by using the time-based window size, all the above issues may be resolved: some embodiments may ensure one frame is grouped in a same window; some embodiments may make the best use of the delay at all times; if the encoder generates the same amount of coded packets within any time-based window, the data rate may be a constant.

Window-wise Sampling Strategy

Nonuniform Global Sampling Distribution Using SWFC

For an LT encoder, each coded packet may be generated using the following two steps: (i) Randomly choose the degree d_n of the packet from a degree distribution r(d); (ii) Choose d_n input packets at random from the original packets uniformly, and a coded packet is obtained as the exclusive-or (XOR) of those d_n packets. The inventors have recognized and appreciated that existing optimization procedures are all built on a common prerequisite - the sampling distribution should be uniform, because the highest efficiency of fountain codes is achieved using uniform distribution.

However, with the time-based sliding window, even if every window's sampling distribution is uniform, the overall sampling distribution may still be nonuniform. The reason is that the number of packets might be different for different frames. For example, as shown in FIG. 6a, the bit rate is not constant for the CIF video sequence foreman. If the video is segmented into 20-frame blocks and the data rate of block coding fountain code is constant, however, as shown as the black line in FIG. 6b, the fluctuation of sampling probabilities is very huge. It is easy to understand that the probability is inversely proportional to the number of bits in that video block.

In SWFC scheme, instead of being related to only one window, the sampling probability of a frame is related to all the windows that covers it, as shown in (1).

pkt

where ρ_ω (t) denotes the average sampling probability of each packet in frame t within the window ω. So, P{t) denotes the total probability of every packet in frame t accumulated through all the sliding windows covering that frame, called accumulated sampling probability, or ASP herein. Here, because we assume that the bits within each frame have equal importance, it is assumed that all the packets in one frame have a same sampling probability, which leads to

where denotes the total probability of the packets in frame t to be sampled, within the

window ω. s(f) is the number of packets in frame t as defined in Table II.

For example, using a uniform-distribution sliding window (window size W=20, step size Δί=5) to slide through the video sequence foreman, we can obtain the ASP as shown as the red line in FIG. 6b. The ASPs shown here are normalized, so the average value of all the probabilities in one scheme is normalized to 1. Because the ASP forms non-uniform distribution, its coding efficiency may be considered low for fountain code.

The inventors have recognized and appreciated that, fortunately, the overlapping property of the sliding window provides a way to stabilize the ASP: the sampling probabilities within each window can be assigned unequally to achieve the overall uniformity of the ASP.

Although selecting the best sampling distribution for each window is an optimization problem, we can still intuitively understand it as follows: if the vicinity of a window has relatively low bit rate and it will get higher in the future, in order to make the overall sampling distribution as homogeneous as possible, we do not want to "waste" the sampling opportunities on the imminent frames, which are already sampled in previous windows for too many times; instead, the encoder should sample more from the future side of the window, such that it could compensate the low sampling probability of the upcoming high bit rate frames. To give a glimpse of what some embodiments of the window-wise sampling strategy can do, FIG. 6b shows the resulting ASPs using different optimization strategies. The inventors have recognized and appreciated that that they are significantly more stable than conventional schemes. Per-frame Optimization Scheme

In order to optimize the sampling distributions for all windows, the video length T, window size W and the number of packets in each frame s(t) (or its vector form s= [s(l) s(2)

... s(T)]) may need to be known. We define to denote the probability of sampling the

packets in the i^th frame of the window starting from the 1^th frame. As in (2), the sampling probability for each packet in that frame within the window, is defined as (3).

As in (1), the ASP for the 1^th frame is defined as (4).

For simplicity, this accumulation process may not consider the step sizes of Δt other than 1. Because both the video length T and window size Ware defined to be integral multiples of Δt, if

Δζ>1, all the parameters can be down-sampled by a factor of Δt. For example, the new

and make a matrix from them. We get the parameter matrix A as in (5).

The number of rows is W because each window may have W sampling probabilities. The number of columns is T-W+l because there are (T-W+l) windows in total (again, Δt is assumed to be 1). Because every row in the matrix represents the probability distribution within a window, the elements in A must satisfy the constraints of (6).

With this notation, (4) can be rewritten into a parameterized form as (7).

The objective is to find the optimal parameter matrix A, which may minimize the fluctuation of the ASPs Because in this problem, the parameters to be optimized are the

sampling probabilities for each frame of every window, this method may be called the per-frame optimization scheme.

Given the total number of frames T, the window size W, and number of packets in each frame s(t), we want to find a set of parameters as in (5), for which the mean square error of the sampling probabilities of all packets attains its minimum value. The optimization problem is defined in (8).

It should be noted that the range of frames we want to stabilize is from W to W-T+l .

Because the frames in that range are all covered by exactly W sliding windows, they are deemed as stable frames. On the other hand, the frames before W or after W-T+l are covered by less than W sliding windows, so they are considered to be warm-up/cool-down frames, and not be counted as the targets of the optimization.

If the conditions of are ignored, this optimization problem can be solved using

Lagrange multiplier. Otherwise, it can be solved by Karush-Kuhn-Tucker (KKT) conditions.

An example of the optimization result is shown in FIG. 7. It is the optimization result of sampling distributions for each window of CIF sequence foreman using per-frame optimization scheme. Window size W=20 and step size Δί=5. Because there are too many windows to be clearly shown in one figure, only a fraction of the windows is presented here. The probabilities are normalized. The trend of the bit rate, which is represented by dashed line, is also plotted in FIG. 7 in order to indicate the relationship between bit rate and optimization results. The straight solid line in FIG. 7 shows the resulting ASP using this per-frame optimization strategy. Because there are variables to optimize and conditions for

Lagrange multiplier (if using KKT conditions, there are conditions), the

optimization process yields the system of equations with equations (or

equations in KKT conditions). Assuming that if we omit

constant factors and lower order terms, the solution of both KKT conditions and Lagrange multiplier involves the generation of a parameter matrix of and the computation of

its inverse matrix. As a result, the computational complexity is

Slope-only Optimization Scheme

Although optimizing the sampling distribution for each frame within every window yields the most optimal solution in terms of minimizing the fluctuation of sampling probabilities between frames, less computationally intensive approaches may be applied in some embodiments. For example, with per- frame optimization, there are too many parameters to be optimized in some embodiments. The computational complexity is which is too

high for large T or W. Secondly, in order to reconstruct the coded packets on the decoder side, the encoder must tell the receiver what sampling distribution is used in each window, by explicitly including every frame's sampling probability in the packet header. That will introduce overhead in the packet header. Since bigger packets are more vulnerable to channel noise, including too much information in headers will increase packet loss rate in wireless networks. As a result, a more concise description for the sampling distributions may be needed for the practical designs, so they can be obtained with lower computational complexity, and be transmitted in the headers with less bits.

The inventors have recognized and appreciated a slope-only description for the sampling distributions. It requires only one parameter - slope factor, denoted as a, to control the shape of the distribution. However, the inventors have recognized and appreciated that using less bits will inevitably lose the precision of describing the sampling distributions. Therefore, compared to the optimal performance that can achieved by using per-frame description, slope-only description may result in suboptimal performance.

The slope factor is a real number, and it ranges from -1 to 1. The distribution functions are defined to be linear functions, and the slope factor only controls the slopes of them. The inventors have recognized and appreciated that it is undesirable for any of the packet's probability to be 0, because in that case, the effective window size will shrink. As a result, we define that when the slope factor a=l, the distribution function of the packets starts from 0 and increases linearly, forming a forward triangular distribution, as shown in the top of Fig. 5; when the slope factor a=0, the distribution function is a uniform distribution, as shown in the second line of Fig. 5; when the slope factor a=-l, the distribution function is the reverse of that in (3=1, or a backward triangular distribution, as shown in the third line of Fig. 5. Therefore, the distribution functions of all the slope factor values in the middle are continuously defined.

As defined in Table I, for time t in video sequence, the number of packets in the window is For each window, a linear distribution function can be defined over the

interval Because the integration of the function in must be 1, we can get

the definitions of the lines for different slope factors a. When slope factor a=l, it passes points

(0,0) and , as the positive-slope line shown in FIG. 9; when slope factor a=-l,

it passes point , as the negative- slope line shown in FIG. 9. The

lines for all slope factors will always pass the point . As a result, the

distribution function given any a and t is (9).

As stated herein, the sampling probabilities of the packets within a same frame should be the same. As a result, the probabilities of sampling one frame should be grouped together, and the actual sampling probability of each packet should be the average value of all packets in its frame. As the example shown in FIG. 10, there are four frames, each of which contains 3, 4, 2, and 2 packets respectively. The actual sampling probability for each packet is the average value of the packets in the interval of its frame. Given slope factor a, the probability of sampling the i^th frame in the window starting from the 1^th frame, denoted as is then

defined as (10).

where pkt(t,i) is defined in Table II. The second equality holds because the distribution function is a linear function, and the average value is taken at the middle point of each interval.

As in (2), given slope factor a, the probability of each packet to be sampled in the

frame within the 1^th window, denoted as

is the average value of the distribution function in the interval

which is defined in (11).

D

As in (1), the ASP for each packet in frame t, denoted as P (t) is defined in (12).

where a denotes the set of slope factors for all windows in the video sequence (from frame 1 to frame (T-W+l) ) as in (13). Again, for simplicity, the accumulation process does not consider the step sizes of Δt other than 1.

We can rewrite (12) for clearer notations as in (14).

where "^■" denotes the dot product of the two vectors of (T-W+l) elements, and

From (15) we can see that d₁ and d₂ are only relevant to s, W, and t, but not influenced by a.

With the equations defined above, we can describe the optimization problem as follows.

Exemplary Approach to Determining Slope

Given the total number of frames Γ, the window size W, and number of packets in each frame s(t), we want to find a set of slope factors as in (13), for which the mean square error of the sampling probabilities of all packets attains its minimum value. The optimization problem is defined in (16).

wher The range of frames we want to stabilize is also from Wto

0

W-T+l, for the same reason as stated in the per- frame condition.

As in the per- frame scheme, the present problem can be solved by KKT conditions. The result of each window's sampling distribution using slope-only optimization is shown in FIG. 11 with the same settings as in FIG. 7. The negative-slope line in FIG. 6b shows the resulting ASP using this slope-only optimization strategy. The inventors have recognized and appreciated that, in terms of stability of ASP, slope-only scheme yields worse result than per-frame scheme. Because there are variables to optimize and conditions for KKT

conditions, the optimization process yields the system of equations with

equations. Assuming that T» W>>Δt, if we omit constant factors and lower order terms, the solution involves the generation of a parameter matrix o and the computation of its

inverse matrix. As a result, the computational complexity is Compared to that of

per-frame scheme, the computational complexity of slope-only scheme is lowered by the factor of (W/Δt)³.

Exemplary System Design

Because of the reasons described above, DAF protocol may be designed based on the slope-only description and optimization scheme.

Warm-up/Cool-down Period

If the window always contains W frames, and it slides from the I^st frame to the T^h frame at the speed of Δί=1 , the inventors have recognized and appreciated that all the frames will be covered in W windows, except for the first W-l frames and the last W-l frames. Namely, the first and the last i^th frame will be covered in i windows when i≤W-l. We call those two periods as warm-up and cool-down periods (W/CP), as illustrated in FIG. 12, since they are undersampled and yield unstable decoding ratio.

In some embodiments, before the actual SWFC begins, both encoder and decoder may obtain the length of W/CP. The encoder will fill these two periods with padding characters, and the decoder will do the same and automatically mark those packets as decoded. Then, the SWFC may be performed. Also, for the sake of fairness, the inventors have recognized and appreciated that the pseudo-decoded padding packets in W/CP should not be counted as being decoded in evaluation, since they do not contain any useful information.

Packet Structure

The structure of a DAF packet header according to some embodiments is shown in FIG. 13. The pay load of a DAF packet may be coded, and its length may be given in the header. The total size of the header may be 15 bytes. The header may include the starting packet position of the window (StartP), the size of current window in the unit of number of packets (WSize), the slope factor used in current window (SlopeF), packet ID (PacketID), and packet size (P).

The data length of SlopeF determines the precision of the slope factors used in generating sampling distribution. In our protocol design, we use 4 bytes as the length of it, which stores a real number as the float type in C++. PacketID starts from 1, and will be increased by 1 every time a coded packet is sent. It serves the similar purpose as in fountain code, which is the random seed for generating degrees and sampling packets.

If a user does not need the sampling distribution optimization due to limited computational power, SlopeF field can be set to 0, which means uniform distribution, to have the low complexity version of DAF (DAF-L).

DAF Encoder

The system design of DAF encoder according to some embodiments is shown as a flowchart in FIG. 14. Such a system may be implemented in an ASIC, an FPGA, or a similar integrated circuit chip. Alternatively or additionally, the system may be implemented as a driver controlling network interface hardware or in any other suitable way. Beforehand, the coding parameters, degree distributions and W/CP may already be obtained by the encoder. The system may take two sets of input: the parameters assigned by user

and the video source.

The video source may feed the system with streamed video data, and it may be first processed by the video preprocessing module, as shown in the dotted box. This module may get the information such as F, s, and may optimize the slope factors a. It may also segment

the data from each frame (or GOP) in to several P-byte packets, and may pad the insufficient packets to P bytes. It may put the segmented video packets in the packet buffer.

The middle row of the flowchart describes the encoding algorithm of DAF system according to some embodiments. After the procedure is triggered by the timer, the scheduler may determine whether to move the window to the next position, according to the parameters and the current status. If not, StartP, WSize, and SlopeF may remain the same as last sent packet; if the window slides, le

In both cases, let PacketID=PacketID+l.

In the next step, a degree d is chosen according to the degree distribution, like that in LT codes. Then, d packets are sampled from the packet buffer in the range confined by StartP and WSize. Each window's packet-wise sampling distribution may be generated by (11), given SlopeF. The bit- wise XOR of these d original packets is obtained as the payload of current coded packet.

At last, the parameters and the payload are assembled as an APP layer packet, according to the structure shown in FIG. 13. The packet will be sent using UDP. Last but not least, the program will set the timer to trigger the procedure again according to the frequency of sending packets, which is determined by parameters such as F,P,R,C, etc.

DAF Decoder

The system design of DAF decoder according to some embodiments is shown as a flowchart in FIG. 15. Also, the coding parameters, degree distributions and W/CP are already obtained by the decoder. The procedure starts when a coded packet is received.

The decoding procedure is basically the reverse of the encoding procedure according to some embodiments. Having StartP, WSize, SlopeF, and PacketID, the degree d, the sampling distribution and the composition of the coded packet can be reconstructed. They may be fed into a belief propagation (BP) decoder, which tries to decode the original packets. The decoded packets may be stored in the packet buffer.

The video playback module may request packets from the packet buffer as the time goes. First, the packets may be re-assembled into frames (or GOPs). If a packet has not been decoded yet when it is requested, it may be considered as a packet loss. If this happens, image processing techniques such as error concealment may be performed to fix it before playing it. General Framework of Fountain Code Systems

The inventors have recognized and appreciated that some embodiments of a DAF system may implement different fountain code schemes by changing settings and modules in the DAF system, but the protocol may not need to be changed.

For example, if a user does not need the sampling distribution optimization due to limited computational power, SlopeF can be set to 0, which means uniform distribution, to have a low complexity version of DAF (DAF-L); the original fountain code can be viewed as a special case when W=T, and let the timer continually send the coded packets until an ACK is received; the block coding schemes can also be viewed as special case when Δt=W; furthermore, the sliding window schemes with packet-based window size, may be a special case with fixed WSize; finally, expanding window can be viewed as another special case if we modify the scheduler of the encoder, by fixing StartP.

As a result, some embodiments of the proposed system may enjoy the flexibility to meet different requirements. Examples based on Simulation Experiments and Performance Evaluation Simulator Setup

We conduct the simulation experiments on Common Open Research Emulator (CORE) and Extendable Mobile Ad-hoc Network Emulator (EMANE). The former provides virtualization on application (APP), transport (UDP or TCP) and network (IP) layer controlled by a graphical user interface, and the latter provide high-fidelity simulation for link (MAC) layer and physical (PHY) layer. The working environment is set up on Oracle VM VirtualBox virtual machines.

We use CORE to emulate the topology of the virtual network and the relay nodes. Two VMs are connected to the virtual network as a source (or encoder/sender) node running the client application, and a destination (or decoder/receiver) node running the server application. A video is streamed from client to server using different schemes.

EMANE is used for emulation of IEEE 802.11b on PHY and MAC layer of each wireless node. Because of the forward error correction (FEC) nature of fountain code, we disable the retransmission mechanism of 802.1 lb for all fountain-code-based schemes. For the simplicity of performance evaluation, we also disable the adaptive rate selection mechanism of 802.1 lb, and only allow the 11 Mbps data rate to be used. Ad-hoc On-Demand Distance Vector (AODV) protocol is used for routing. Performance Metric

We use packet decoding ratio to evaluate the performance of the schemes, since higher packet decoding ratio may imply higher visual quality of video. It is worth noting that the evaluation criteria of delay-aware multimedia streaming is different from the file transfer applications, and it is commonly overlooked by existing SWFC schemes. In delay-aware applications, if a packet is decoded after its playback time, it has to be counted as a packet loss for the video decoder, since the player does not rewind the video.

As a result, we introduce the metric of in-time decoding ratio (IDR), which only counts a decoded packet as "in-time" decoded when it is within the current window. Comparatively, file decoding ratio (FDR) means the percentage of total decoded packets after the complete coding session finishes. For SWFC schemes, it may alwaysbe that FDR≥IDR; for block coding, it may be that FDR=IDR.

Performance Evaluation

We conduct experiments for the following cases: (i) one hop with no node mobility (fixed topology) under various delay requirements and code rates, (ii) various number of hops with no node mobility (fixed topology) under fixed packet loss rate per hop, (iii) two hops with a moving relay node (dynamic topology). We implement six schemes for comparisons, which are abbreviated as follows:

1. DAF: the proposed delay-aware fountain code protocol as introduced in some embodiments herein.

2. DAF-L: the low complexity version of some embodiments of the DAF scheme. Some embodiments of DAF-L may be DAF without using the optimized window-wise sampling distribution, as proposed herein.

3. S-LT: a sliding window LT code.

4. Block: the block coding for fountain codes.

5. Expand: an expanding window scheme.

6. TCP: this scheme uses TCP protocol to stream video. In order to add delay awareness, the video file is also segmented into the blocks like in "Block" scheme, but they are sent using TCP. For the sake of fairness, the maximum data rate is limited to the same amount as required by the SWFC schemes.

All the five fountain-code-based schemes use the following parameter setting: the packet size P=1024 bytes; for degree distribution, let 5=0.02, c=0.4, so LT code can get good average performances. Several benchmark CIF test sequences are used for our evaluation. They are coded into H.264/AVC format using x264, encapsulated into ISO MP4 files using MP4Box, and streamified by mp4trace tool from EvalVid tool- set. The coding structure is IPPP, which contains only one I-frame (the first frame) and no B -frame, and all the rest are P-frames. Let N _p=l. For the sake of clarity, all the delays shown in the experiments are in the unit of seconds. Because the frame rate for all sequences are 30 frames per second, it is easy to convert the unit between seconds and number of frames. Denote by ^J C and 7" Del ,ay the code rate and the tolerable delay, respectively.

We conduct 20 experiments for each setting with different random seeds, and take the median value of them as the performance measure. Two results are shown for each set of experiment: in-time decoding ratio (IDR) and file decoding ratio (FDR).

Case 1: One hop with no node mobility

In this case, there are two nodes in the network: a source node and a destination node. The communication path from the source to the destination has one hop. The distance between the two nodes is carefully set so that the packets with 1024-byte payload will have 10% packet loss rate (PLR). Let Δί=1.

We use the CIF sequence foreman for the experiments. FIG. 16 shows the relations of IDR vs. C of CIF sequence foreman for different T_D . The results of four delays are shown:

0.5, 1, 1.5, and 1.83 seconds. FIG. 17 shows the relations of IDR vs. T_D of CIF sequence foreman for different C. The results of four code rates are shown: 1.0, 0.9, 0.85, and 0.75. Only partial results of "block" scheme are shown, because its values are too small to be maintained in the same scale as others.

We choose all the combinations of 7" Del ,ay e [0.8,1.8] and Ce [0.6,0.9] to conduct the experiments. There are two dimensions of variables, T_D and C, so the results of each scheme form a surface. FIG. 18 shows five surfaces of the schemes.

The numerical comparative results between different schemes with variant delays and code rates of sequences foreman are shown in Table III.

From the results above, we have the following observations:

• Among all schemes, DAF has the highest decoding ratio. As shown in FIG. 18, almost the entire surface of DAF is above the other schemes. The performance of DAF-L is lower than DAF, but higher than others. DAF outperforms DAF-L because the overall sampling distribution of DAF is more homogeneous. The proposed schemes improve the decoding ratio when coding resource is insufficient or tolerable delay is small.

• The performance of S-LT is lower than two proposed schemes, but higher than others. DAF and DAF-L outperform S-LT because their window size is bigger.

• If C is low enough or _D is large enough, the decoding ratios of all three

SWFC schemes converge to 100%. Correspondingly, if C is too high or _D is too small, their performances are equally bad, or DAF may be even worse than the DAF-L scheme. That is because when data rate is extremely limited, DAF makes all the frames unlikely to be decoded at the same time, while in DAF-L scheme, some frames with very low bit rate will be decoded. However, since in those scenarios the video decoding ratios are below 50%, which is too low to be properly viewed, they are not the cases of the most concern. • The decoding ^σ ratio of all the schemes is an increasing ^σ function of 7" Del ,ay , and also a decreasing function of C. That means larger delay and lower code rate lead to higher overall performance, which meets our expectation. Also, Table III shows that in order to obtain the decoding ^σ ratio at a certain level, we need to balance 7" Del ,ay and C. · TCP's performance is relatively low. The reason is that TCP is not suitable for wireless scenarios where PLR is high. The slow start, congestion avoidance phases and congestion control mechanisms lower its performance.

• Block scheme performs the poorest among all schemes. Since the blocks are too small (7^ /2) and non-overlapping, the coding overhead is very large.

· The above observations are true for both IDR and FDR. There is always

FDR≥IDR, as we pointed out herein. For TCP and Block schemes, there is FDR=IDR, because the frames prior to current window will never be decoded in the future.

• Although decoding ratios of DAF and DAF-L are high (90%-99%) compared to other schemes, it hardly reaches 100%, due to the limitations of LT code. Case 2: Various number of hops with no node mobility

The setup of this set of experiments is the following. The network consists of a source node, a destination node, and 0 or 1 or 2 relay nodes. All the nodes in the network form a chain topology from the source node to the destination node. The communication path from the source node to the destination node has 1 or 2 or 3 hops. All the nodes are immobile; hence the network topology is fixed. For all the experiments in Case 2, we set PLR=5% for each hop/link. Let Δί=1.

The IDR results of sequences mobile and akiyo are compared in Table IV, where N/A means the corresponding decoding ratio is below 10% and unable to recover any consecutive frames, making the actual decoding ratio insignificant.

We have the following observations:

• The relationship of DAF, DAF-L, and S-LT remains the same as in case 1 for different PLR: DAF achieves the highest decoding ratio among all schemes; DAF-L scheme is the second best; S-LT performs the worst among the three. That shows the proposed schemes maintain their advantages over the state-of-the-art schemes in a wide range of network conditions.

• The performance of block coding scheme is still the lowest among all schemes.

• TCP performs relatively well in the cases when PLR=5%, but they are extremely inefficient when PLR=15%. That is because its performance is very sensitive to packet losses. High loss rate will cause TCP to time out.

Case 3: Two hops with a moving relay node

The setup of this set of experiments is the following. There are three nodes in the network: a source node and a destination node are fixed, and a relay node is moving. The distance between the source node and the destination node is 1200 meters; the transmission range of each node is 700 meters. Hence, the source node cannot directly communicate with the destination node; a relay node is needed. The relay node is moving back and forth along the straight line, which is perpendicular to the straight line that links the source node and the destination node; in addition, the relay node has equal distance to the source node and the destination node. When the relay node moves into the transmission range of the source node, it can pick up the packets transmitted by the source node, and relay the packets to the destination node. When the relay node moves out of the transmission range of the source node, it cannot receive packets transmitted by the source node although the source node keeps transmitting; in this case, all the packets transmitted by the source node will be lost. The communication path from the source node to the destination node has two hops. Since the relay node moves around, the network topology is dynamic.

In this set of experiments, we stream the sequence coastguard, with C=0.8 (the corresponding R=26l 1), and T_D =Q.$s. Table V shows the IDR of schemes under Case 3. We have the following observations:

• DAF and DAF-L still perform the best among all the schemes. However, the decoding ratios are not as high as previous cases. That is because when the relay node temporarily moves out of the connection range, the source does not stop streaming video, therefore the content transmitted during the disconnecting period is lost.

• The performance of S-LT is worse than proposed schemes but better than others, and the block coding scheme is still among the worst schemes, for the same reasons as in Case 1.

• The performance of TCP scheme is also poor, because the disconnecting period causes the time-out in TCP.

V: DM coffi|Mosi»s under Cass 3,

References

The following references are incorporated herein by reference in their entireties:

[1] C. Perkins, RTP: Audio and Video for the Internet. Addison- Wesley Professional,

2003.

[2] Z. He, Y. Liang, L. Chen, I. Ahmad, and D. Wu, "Power-rate-distortion analysis for wireless video communication under energy constraints," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 5, pp. 645-658, 2005. [3] J. W. Byers, M. Luby, and M. Mitzenmacher, "A digital fountain approach to asynchronous reliable multicast," IEEE Journal on Selected Areas in Communications, vol. 20, no. 8, pp. 1528-1540, 2002.

[4] M. Luby, "LT codes," in 2013 IEEE 54th Annual Symposium on Foundations of Computer Science. IEEE Computer Society, 2002, pp. 271-271.

[5] A. Shokrollahi, "Raptor codes," IEEE Transactions on Information Theory, vol. 52, no. 6, pp. 2551-2567, 2006.

[6] M. C. Bogino, P. Cataldi, M. Grangetto, E. Magli, and G. Olmo, "Sliding-window digital fountain codes for streaming of multimedia contents," in IEEE International

Symposium on Circuits and Systems, 2007. ISCAS 2007. IEEE, 2007, pp. 3467-3470.

[7] P. Cataldi, M. Grangetto, T. Tillo, E. Magli, and G. Olmo, "Sliding-window raptor codes for efficient scalable wireless video broadcasting with unequal loss protection," IEEE Transactions on Image Processing, vol. 19, no. 6, pp. 1491-1503, 2010.

[8] S. Ahmad, R. Hamzaoui, and M. M. Al-Akaidi, "Unequal error protection using fountain codes with applications to video communication," IEEE Transactions on Multimedia, vol. 13, no. l, pp. 92-101, 2011.

[9] D. Sejdinovic, D. Vukobratovic, A. Doufexi, V. Senk, and R. J. Piechocki, "Expanding window fountain codes for unequal error protection," IEEE Transactions on Communications, vol. 57, no. 9, pp. 2510-2516, 2009.

[10] D. Vukobratovic, V. Stankovic, D. Sejdinovic, L. Stankovic, and Z. Xiong,

"Scalable video multicast using expanding window fountain codes," IEEE Transactions on Multimedia, vol. 11, no. 6, pp. 1094-1104, 2009.

[11] G. Liva, E. Paolini, and M. Chiani, "Performance versus overhead for fountain codes over Fq," IEEE Communications Letters, vol. 14, no. 2, pp. 178-180, 2010.

[12] E. Hyyti^' a, T. Tirronen, and J. Virtamo, "Optimizing the degree distribution of LT codes with an importance sampling approach," in RESJJVI 2006, 6th International Workshop on Rare Event Simulation, 2006.

[13] , "Optimal degree distribution for LT codes with small message length," in

INFOCOM 2007. 26th IEEE International Conference on Computer Communications. IEEE. IEEE, 2007, pp. 2576-2580.

[14] P. Cataldi, M. P. Shatarski, M. Grangetto, and E. Magli, "Implementation and performance evaluation of LT and raptor codes for multimedia applications," in International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2006.

IIH-MSPO6. IEEE, 2006, pp. 263-266. [15] J. Ahrenholz, "Comparison of core network emulation platforms," in MILITARY COMMUNICATIONS CONFERENCE, 2010-MILCOM 2010. IEEE, 2010, pp. 166-171.

[16] U.S. Naval Research Laboratory, Networks and Communication Systems Branch. The extendable mobile ad-hoc network emulator (EMANE). [Online]. Available:

http://www.nrl.navy.mil/itd/ncs/products/emane

[17] P. Seeling and M. Reisslein, "Video transport evaluation with H.264 video traces," IEEE Communications Surveys and Tutorials, vol. 14, no. 4, pp. 1142-1165, 2012.

[18] x264 team. x264. [Online]. Available:

http ://w w w . videolan . org/developer s/x264.html

[19] GPAC project. MP4Box. [Online]. Available:

http ://gpac . wp .minestelecom.fr/mp4box/

[20] J. Klaue, B. Rathke, and A. Wolisz, "Evalvid-a framework for video transmission and quality evaluation," in Computer Performance Evaluation. Modelling Techniques and Tools. Springer, 2003, pp. 255-272.

[21] G. Holland and N. Vaidya, "Analysis of TCP performance over mobile ad hoc networks," Wireless Networks, vol. 8, no. 2/3, pp. 275-288, 2002.

[22] R. Palanki and J. S. Yedidia, "Rateless codes on noisy channels," in IEEE International Symposium on Information Theory. Citeseer, 2004, pp. 37-37.

Computing Environment

Techniques for increasing data throughput and decreasing transmission delay along a data link from a source node to a sink node may be implemented on any suitable hardware, including a programmed computing system. For example, FIG. 1 illustrates a system implemented with multiple computing devices, which may be distributed and/or centralized. Also, FIGS. 2 and 3 illustrate algorithms executing on at least one computing device. FIG. 19 illustrates an example of a suitable computing system environment 300 on which embodiments of these algorithms may be implemented. This computing system may be representative of a computing system that implements the described technique of increasing data throughput and decreasing transmission delay from a source node to a sink node via a relay node. However, it should be appreciated that the computing system environment 300 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computing environment 300 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the exemplary operating environment 300. The invention is operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments, or cloud-based computing environments that include any of the above systems or devices, and the like.

The techniques described herein may be implemented in whole or in part within network interface 370. The computing environment may execute computer-executable instructions, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

With reference to FIG. 19, an exemplary system for implementing the invention includes a general purpose computing device in the form of a computer 310. Though a programmed general purpose computer is illustrated, it should be understood by one of skill in the art that algorithms may be implemented in any suitable computing device. Accordingly, techniques as described herein may be implemented in a system for increasing data throughput and decreasing transmission delay along a data link from a source node to a sink node. These techniques may be implemented in such network devices as originally manufactured or as a retrofit, such as by changing program memory devices holding programming for such network devices or software download. Thus, some or all of the components illustrated in FIG. 19, though illustrated as part of a general purpose computer, may be regarded as representing portions of a node or other component in a network system.

Components of computer 310 may include, but are not limited to, a processing unit 320, a system memory 330, and a system bus 321 that couples various system components including the system memory 330 to the processing unit 320. The system bus 321 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel

Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as

Mezzanine bus.

Computer 310 typically includes a variety of computer readable media. Computer readable media can be any available media that can be accessed by computer 310 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by computer 310. Communication media typically embodies computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example and not limitation, communication media includes wired media such as a wired network or direct- wired connection, and wireless media such as acoustic, radio frequency (RF), infrared (IR), and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.

The system memory 330 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 331 and random access memory (RAM) 332. A basic input/output system 333 (BIOS), containing the basic routines that help to transfer information between elements within computer 310, such as during start-up, is typically stored in ROM 331. RAM 332 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 320. By way of example and not limitation, FIG. 19 illustrates operating system 334, application programs 335, other program modules 336, and program data 337.

The computer 310 may also include other removable/non-removable,

volatile/nonvolatile computer storage media. By way of example only, FIG. 19 illustrates a hard disk drive 341 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 351 that reads from or writes to a removable, nonvolatile magnetic disk 352, and an optical disk drive 355 that reads from or writes to a removable, nonvolatile optical disk 356 such as a CD-ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 341 is typically connected to the system bus 321 through an non-removable memory interface such as interface 340, and magnetic disk drive 351 and optical disk drive 355 are typically connected to the system bus 321 by a removable memory interface, such as interface 350.

The drives and their associated computer storage media discussed above and illustrated in FIG. 19, provide storage of computer readable instructions, data structures, program modules, and other data for the computer 310. In FIG. 19, for example, hard disk drive 341 is illustrated as storing operating system 344, application programs 345, other program modules 346, and program data 347. Note that these components can either be the same as or different from operating system 334, application programs 335, other program modules 336, and program data 337. Operating system 344, application programs 345, other program modules 346, and program data 347 are given different numbers here to illustrate that, at a minimum, they are different copies. A user may enter commands and information into the computer 310 through input devices such as a keyboard 362 and pointing device 361, commonly referred to as a mouse, trackball, or touch pad. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 320 through a user input interface 360 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port, or a universal serial bus (USB). A monitor 391 or other type of display device is also connected to the system bus 321 via an interface, such as a video interface 390. In addition to the monitor, computers may also include other peripheral output devices such as speakers 397 and printer 396, which may be connected through an output peripheral interface 395.

The computer 310 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 380. The remote computer 380 may be a personal computer, a server, a router, a network PC, a peer device, or some other common network node, and typically includes many or all of the elements described above relative to the computer 310, although only a memory storage device 381 has been illustrated in

FIG. 19. The logical connections depicted in FIG. 19 include a local area network (LAN) 371 and a wide area network (WAN) 373, but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.

When used in a LAN networking environment, the computer 310 is connected to the LAN 371 through a network interface or adapter 370. When used in a WAN networking environment, the computer 310 typically includes a modem 372 or other means for establishing communications over the WAN 373, such as the Internet. The modem 372, which may be internal or external, may be connected to the system bus 321 via the user input interface 360, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 310, or portions thereof, may be stored in the remote memory storage device. By way of example and not limitation, FIG. 19 illustrates remote application programs 385 as residing on memory device 381. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.

Having thus described several aspects of at least one embodiment of this invention, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art.

In some embodiments, techniques described herein may be used in streaming multimedia services. Such services may include streaming multimedia, such as video and/or audio, over a network such as the Internet between a streaming server and a client. Some embodiments of a streaming service may be over a different network, such as a LAN, where the streaming server may be installed on a computer within the premises of a customer, such as a house or office building. Alternatively or additionally, the streaming server may be geographically remote relative to the clients, and the connection between the server and clients may be a dedicated wireless connection. Alternatively, the connection may be over a shared network such as a 5G cellular network.

Such alterations, modifications, and improvements are intended to be part of this disclosure, and are intended to be within the spirit and scope of the invention. Further, though advantages of the present invention are indicated, it should be appreciated that not every embodiment of the invention will include every described advantage. Some embodiments may not implement any features described as advantageous herein and in some instances.

Accordingly, the foregoing description and drawings are by way of example only.

The above-described embodiments of the present invention can be implemented in any of numerous ways. For example, the embodiments may be implemented using hardware, software or a combination thereof. When implemented in software, the software code can be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers. Such processors may be implemented as integrated circuits, with one or more processors in an integrated circuit component. Though, a processor may be implemented using circuitry in any suitable format.

Further, it should be appreciated that a computer may be embodied in any of a number of forms, such as a rack-mounted computer, a desktop computer, a laptop computer, or a tablet computer. Additionally, a computer may be embedded in a device not generally regarded as a computer but with suitable processing capabilities, including a Personal Digital Assistant (PDA), a smart phone, or any other suitable portable or fixed electronic device.

Also, a computer may have one or more input and output devices. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for visual presentation of output and speakers or other sound generating devices for audible presentation of output. Examples of input devices that can be used for a user interface include keyboards, and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, a computer may receive input information through speech recognition or in other audible format.

Such computers may be interconnected by one or more networks in any suitable form, including as a local area network or a wide area network, such as an enterprise network or the Internet. Such networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks, or fiber optic networks.

Also, the various methods or processes outlined herein may be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages and/or programming or scripting tools, and also may be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine.

In this respect, the invention may be embodied as a computer readable storage medium (or multiple computer readable media) (e.g., a computer memory, one or more floppy discs, compact discs (CD), optical discs, digital video disks (DVD), magnetic tapes, flash memories, circuit configurations in Field Programmable Gate Arrays or other semiconductor devices, or other tangible computer storage medium) encoded with one or more programs that, when executed on one or more computers or other processors, perform methods that implement the various embodiments of the invention discussed above. As is apparent from the foregoing examples, a computer readable storage medium may retain information for a sufficient time to provide computer-executable instructions in a non-transitory form. Such a computer readable storage medium or media can be transportable, such that the program or programs stored thereon can be loaded onto one or more different computers or other processors to implement various aspects of the present invention as discussed above. As used herein, the term

"computer-readable storage medium" encompasses only a computer-readable medium that can be considered to be a manufacture (i.e., article of manufacture) or a machine. Alternatively or additionally, the invention may be embodied as a computer readable medium other than a computer-readable storage medium, such as a propagating signal.

The terms "program" or "software" are used herein in a generic sense to refer to any type of computer code or set of computer-executable instructions that can be employed to program a computer or other processor to implement various aspects of the present invention as discussed above. Additionally, it should be appreciated that according to one aspect of this embodiment, one or more computer programs that when executed perform methods of the present invention need not reside on a single computer or processor, but may be distributed in a modular fashion amongst a number of different computers or processors to implement various aspects of the present invention.

Computer-executable instructions may be in many forms, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Typically the functionality of the program modules may be combined or distributed as desired in various embodiments.

Also, data structures may be stored in computer-readable media in any suitable form. For simplicity of illustration, data structures may be shown to have fields that are related through location in the data structure. Such relationships may likewise be achieved by assigning storage for the fields with locations in a computer-readable medium that conveys relationship between the fields. However, any suitable mechanism may be used to establish a relationship between information in fields of a data structure, including through the use of pointers, tags, or other mechanisms that establish relationship between data elements.

Various aspects of the present invention may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing and is therefore not limited in its application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.

Also, the invention may be embodied as a method, of which an example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.

Use of ordinal terms such as "first," "second," "third," etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having a same name (but for use of the ordinal term) to distinguish the claim elements.

Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of "including," "comprising," or "having," "containing," "involving," and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

In the attached claims, various elements are recited in different claims. However, the claimed elements, even if recited in separate claims, may be used together in any suitable combination.

Claims

Claims What is claimed is:

1. A network system for increasing data throughput and decreasing transmission delay along a data link from a source node to a sink node, the network system comprising:

a first node configured to:

encode a plurality of video data packets using rateless coding based on at least one video encoding characteristic of a video source of the plurality of video data packets by adaptively adjusting at least one parameter of the rateless coding based on the at least one video encoding characteristic, and

transmit, over the data link to at least one second node, the plurality of video data packets.

2. The network system of claim 1, wherein:

the rateless coding comprises fountain coding, and

adaptively adjusting the at least one parameter of the rateless coding comprises:

adaptively selecting a first length of a first data window and a second length of a second data window of a plurality of overlapping sliding data windows, wherein:

the selecting is based on a number of bits in frames of the video source and/or on a first number of frames in the video source,

the first length comprises a second number of frames that the first data window can accommodate, and

the second length comprises a third number of frames that the second data window can accommodate; and

segmenting data from the frames of the video source into the plurality of video data packets based on at least the first length and the second length.

3. The network system of claim 2, wherein:

the plurality of overlapping sliding data windows are non-homogeneous.

4. The network system of claim 3, wherein:

the plurality of overlapping sliding data windows collectively have more than one length and/or more than one sampling distribution.

5. The network system of claim 2, wherein:

adaptively adjusting a first sampling distribution for the first data window and a second sampling distribution for the second data window, wherein:

the adjusting is based on a video bit rate of the video source, and the video bit rate is variable.

6. The network system of claim 2, wherein:

the first node is configured to obtain the at least one video encoding characteristic by preprocessing the video source.

7. The network system of claim 5, wherein:

the first node is configured to:

store the first length of the first data window in a header of a first packet of the first data window,

store the second length of the second data window in a header of a second packet of the second data window,

store the first sampling distribution in the header of the first packet of the first data window as a first slope factor, and

store the second sampling distribution in the header of the second packet of the second data window as a second slope factor.

8. The network system of claim 1, wherein:

the at least one video encoding characteristic comprises:

a frame rate of the video source,

a number of frames in a group of pictures of the video source, and/or a video bit rate of the video source, the video bit rate being variable.

9. The network system of claim 1, wherein:

the data link is at least partially wireless.

10. The network system of claim 1 , wherein at least one video data packet of the plurality of video data packets comprises at least 100 bits.

11. The network system of claim 1, wherein:

the first node comprises a streaming video server.

12. The network system of claim 11, wherein:

the at least one second node comprises:

a sink node configured to receive one or more of the plurality of video data packets from the streaming video server via at least one relay node, or

at least one relay node configured to receive at least one of the plurality of video data packets from the streaming video server.

13. At least one computer-readable storage medium encoded with executable instructions that, when executed by at least one processor, cause the at least one processor to perform a method for transmitting a video stream over a data link from a source node to a sink node via a relay node, the method comprising:

encoding a plurality of video data packets using rateless coding based on at least one video encoding characteristic of a video source of the plurality of video data packets by adaptively adjusting at least one parameter of the rateless coding based on the at least one video encoding characteristic; and

transmitting, over the data link, the plurality of video data packets.

14. The at least one computer-readable storage medium of claim 13, wherein: