We provide separate data sets for training and for evaluation, collected in a controlled network testbed. The testbed is located at Leibniz University Hannover and comprises about 80 machines, each connected by at least 4 Ethernet links of 1 Gbps and 10 Gbps capacity via VLAN switches. The testbed is managed by the Emulab software, which configures the machines as hosts and routers and connects them using VLANs to implement the desired topology. We use a dumbbell topology with multiple tight links. Emulab employs additional machines to emulate link characteristics such as capacity, delay, and packet loss. We use the MoonGen software to emulate link capacities that differ from the native Ethernet capacity. To achieve a packet spacing that accurately matches the emulated capacity, MoonGen fills the gaps between packets with dummy frames that are discarded at the output of the link. We use the forward rate Lua script for the MoonGen API to achieve the desired forwarding rate for the transmission and reception ports of MoonGen.
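The gap-filling idea can be illustrated with a little arithmetic. The following is a minimal sketch, not MoonGen code: the function name `filler_bytes` is hypothetical, and it assumes an idealized link without preamble, inter-frame gap, or minimum frame size. It computes how many bytes of dummy frames would have to follow each packet on the native link so that the packet spacing matches the emulated capacity.

```python
def filler_bytes(packet_bytes: int, native_bps: float, emulated_bps: float) -> float:
    """Bytes of dummy frames that follow one packet on the native link.

    Assumption (not from MoonGen): Ethernet framing overhead is ignored,
    so a packet simply occupies packet_bytes * 8 / rate seconds on the wire.
    """
    if emulated_bps <= 0 or emulated_bps > native_bps:
        raise ValueError("emulated capacity must be in (0, native capacity]")
    # At the emulated capacity one packet occupies a time slot that, on the
    # faster native link, corresponds to this many bytes.
    slot_bytes = packet_bytes * native_bps / emulated_bps
    # The packet fills part of the slot; dummy frames fill the remainder.
    return slot_bytes - packet_bytes


# Emulating 100 Mbps on a native 1 Gbps link with 1500-byte packets.
gap = filler_bytes(1500, 1e9, 100e6)
print(f"dummy bytes per packet: {gap:.0f}")
```

Under these assumptions, each 1500-byte packet is followed by nine packet lengths of filler, so packets leave the link spaced as if it ran at one tenth of the native rate.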
Cross traffic of different types and intensities is generated using D-ITG. The cross traffic is single-hop persistent, i.e., fresh cross traffic is multiplexed at each link. The probe traffic is path persistent, i.e., it travels along the entire network path to estimate the end-to-end available bandwidth. We use RUDE & CRUDE to generate UDP probe streams. Packet timestamps are recorded at the probe sender and at the probe receiver using libpcap on the respective hosts. We also use an Endace DAG measurement card to obtain accurate reference timestamps. The timestamps are used to compute input and output rates for each packet train.
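As an illustration of the last step, the rate of a packet train can be estimated from its timestamps. The sketch below is an assumption about how such a computation might look, not the processing actually used for the traces: the function name `train_rate_bps`, the fixed packet size, and the example timestamps are all hypothetical.

```python
def train_rate_bps(timestamps_s, packet_bytes):
    """Average rate of a packet train in bit/s.

    Assumption: all packets have the same size, and the rate is taken as
    the bits of packets 2..n divided by the time between the first and
    the last timestamp (n - 1 inter-packet gaps).
    """
    n = len(timestamps_s)
    if n < 2:
        raise ValueError("a packet train needs at least two packets")
    duration = timestamps_s[-1] - timestamps_s[0]
    if duration <= 0:
        raise ValueError("timestamps must be increasing")
    return (n - 1) * packet_bytes * 8 / duration


# Hypothetical train of five 1500-byte packets:
# sent with 120 us spacing, received with 160 us spacing.
send_ts = [i * 0.00012 for i in range(5)]
recv_ts = [i * 0.00016 for i in range(5)]

r_in = train_rate_bps(send_ts, 1500)    # about 100 Mbps
r_out = train_rate_bps(recv_ts, 1500)   # about 75 Mbps
print(f"input {r_in / 1e6:.1f} Mbps, output {r_out / 1e6:.1f} Mbps")
```

In this made-up example the output rate is lower than the input rate, which is the behavior expected when the probe rate exceeds the available bandwidth of the path.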
Each data set consists of a k-dimensional matrix of data ratios at the sender and the receiver, which are input to the neural network, and the input rate, which is used to calculate the rate increment with respect to which the available bandwidth and the bottleneck capacity are normalized. The data sets cover different network parameters and address problems that are known to be difficult in bandwidth estimation. To evaluate the scale invariance of our proposed method with respect to network capacity, we generated two data sets with different capacities of a single tight link. The underestimation of available bandwidth in multi-hop networks is addressed by a third data set with multiple tight links. To further increase the difficulty in a multi-hop network, a fourth data set is generated for networks where the tight link differs from the bottleneck link.
The first data set consists of samples for a single tight link with capacity C = 100 Mbps and exponential cross traffic with an average rate of 25, 50, and 75 Mbps. The second data set is again for a single tight link, but with a different capacity of C = 50 Mbps and exponential cross traffic with an average rate of 12.5, 25, and 37.5 Mbps, respectively. The third data set is generated for multiple tight links with capacity C = 100 Mbps and exponential cross traffic with an average rate of 50 Mbps. The fourth data set is for networks where the tight link differs from the bottleneck link and considers two scenarios: in the first, the tight link follows the bottleneck link; in the second, it precedes the bottleneck link. In both scenarios, the tight link capacity is C = 100 Mbps and the bottleneck capacity is Cb = 50 Mbps, with cross traffic intensities of 75 Mbps and 12.5 Mbps, respectively.
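The parameters above can be checked with a short calculation. The sketch below uses the standard definitions (available bandwidth of a link is its capacity minus the average cross traffic rate, the tight link has the smallest available bandwidth, the bottleneck link has the smallest capacity). The names `Link`, `tight_link`, and `bottleneck_link` are hypothetical helpers, and the link order shown is our reading of the first scenario of the fourth data set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    capacity_mbps: float
    cross_traffic_mbps: float

    @property
    def available_mbps(self) -> float:
        # Available bandwidth = capacity minus average cross traffic rate.
        return self.capacity_mbps - self.cross_traffic_mbps


def tight_link(path):
    """Link with the smallest available bandwidth."""
    return min(path, key=lambda link: link.available_mbps)


def bottleneck_link(path):
    """Link with the smallest capacity."""
    return min(path, key=lambda link: link.capacity_mbps)


# Fourth data set, first scenario (assumed order): bottleneck link, then tight link.
path = [Link(50.0, 12.5), Link(100.0, 75.0)]
print("tight link capacity:", tight_link(path).capacity_mbps)
print("bottleneck capacity:", bottleneck_link(path).capacity_mbps)
print("end-to-end available bandwidth:", tight_link(path).available_mbps)

# First and second data sets: available bandwidth relative to capacity.
first = [Link(100.0, x).available_mbps / 100.0 for x in (25.0, 50.0, 75.0)]
second = [Link(50.0, x).available_mbps / 50.0 for x in (12.5, 25.0, 37.5)]
print(first, second)
```

The 100 Mbps link leaves 25 Mbps available, less than the 37.5 Mbps left on the 50 Mbps link, so the tight link and the bottleneck link are indeed different. The first two data sets yield the same ratios of available bandwidth to capacity, which is what makes them suitable for testing scale invariance.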
These traces are provided in a zipped folder that can be downloaded via the link BandwidthEstimationTraces. The folder contains two separate subfolders for training and testing. Cross traffic burstiness is covered by using exponential traffic ("et"), constant bit rate traffic ("ct"), and Pareto traffic ("pt").