ufes-20191-redes-mininet/proj-artigo/on_topologies_perfomance.tex

\documentclass[a4paper,11pt]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{lmodern}
% \usepackage[portuguese]{babel}
\usepackage[margin=2cm]{geometry}
\usepackage{fancyhdr}
\usepackage[backend=bibtex,style=numeric,citestyle=numeric,maxcitenames=3]{biblatex}
\usepackage{hyperref}
\usepackage{float}
\usepackage{multicol}
\usepackage{listings}
\usepackage{graphicx}
\usepackage{color}
\usepackage{colortbl}
\usepackage[table]{xcolor}
\usepackage{xcolor}
\usepackage[mathscr]{euscript}
\let\euscr\mathscr \let\mathscr\relax% just so we can load this and rsfs
\usepackage[scr]{rsfso}
\newcommand{\powerset}{\raisebox{.15\baselineskip}{\Large\ensuremath{\wp}}}

\pagestyle{myheadings}
\setlength{\parindent}{0pt}
\setlength{\parskip}{0.85em}

\title{Comparative performance evaluation of Low Delay Routing on emulated environment}
\author{Ádler Oliveira Silva Neves \\ \href{mailto:adler.neves@aluno.ufes.br}{\texttt{adler.neves@aluno.ufes.br}}}
\date{}

\bibliography{gvozdiev2018onlowlatency}
\bibliography{awslambdapricing}
\bibliography{lantz2010network}
\bibliography{ryusdnframework}
\bibliography{bui2015rethinking}
\bibliography{kelton2017improving}
\bibliography{pries2011power}
\bibliography{alshawi2018clos}
\bibliography{neves2019comparator}
\bibliography{matplotlib}
\bibliography{pythonternary}

\begin{document}
\maketitle


\begin{multicols}{2}

\section*{Abstract}
% \begin{abstract}
On a computer network, packets hop from network element to network element as a way to propagate a message that is expected to reach at its destination.
How the network is doing this, however, is constantly changing, as Software Defined Networks brought programmability to the switches that forward packets.
With some web services getting billed by the millisecond, the load speed influencing both user experience and battery usage on mobile devices, latency became crucial.
There's a routing algorithm by \citeauthor{gvozdiev2018onlowlatency} (\citeyear{gvozdiev2018onlowlatency}) which has promising but untested results, which may be subject to unknown limitations.
Here we show that Low Delay Routing doesn't handle extreme cases of overload well.
We found that when all routes are congested, Low Delay Routing prioritize bandwidth over both latency and jitter.
Our results demonstrate that such routing algorithm achieves low latency as a side-effect of maximizing bandwidth, which isn't guaranteed happen on all cases.
These cases in which low latency was still achieved were with CLOS and B-Cube topologies.
We anticipate that the major contribution of our article to be the framework used to compare more algorithms on more topologies.
% \end{abstract}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Introduction} % You will find background information and a statement of the author's hypothesis in the introduction

On a world with web services billed by the millisecond, such as AWS Lambda\cite{awslambdapricing}, time is money.
Accessing a remote database from such environment means that the time spent waiting the connection to establish is money lost.
On the web, such delay in the response from the server side increases power consumption on mobile devices\cite{bui2015rethinking} and decreases user experience\cite{kelton2017improving}.
Therefore, reducing such latency is crucial and would bring clear benefits.

\citeauthor{gvozdiev2018onlowlatency} proposes Low Delay Routing\cite{gvozdiev2018onlowlatency} (LDR) as a solution for this problem, which is a network routing algorithm that got impressive results on the metrics that reflect its purpose.
Despite its attractive results on latency, there's no in-depth evaluation of bandwidth and jitter.
Storage and processing components usually requires that the owner chooses up to two attributes from high speed, high capacity and low cost, and, extending such dilemma to the network area, LDR had to give up somewhere.
But where? Are there other limits which were not mentioned?

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Materials and Methods} % The methods section will help you determine exactly how the authors performed the experiment.
To answer that question, we set up a testing environment with Mininet, Ryu and some own code on a PC equipped with an i7 4790 (4 cores, 8 threads, 4GHz) and 16GB RAM DDR3@1600MHz, running Arch Linux.
``Mininet is a system for rapidly prototyping large networks on the constrained resources of a single laptop''\cite[p. 1]{lantz2010network}, which was used to create the emulated network and run commands on the hosts it created.
``Ryu is a component-based software defined networking framework''\cite{ryusdnframework}, which was used to define the behavior that the switches would have using OpenFlow.
Our own code\cite{neves2019comparator} is a collection of Python scripts and a Makefile, including our partial re-implementations of \citeauthor{gvozdiev2018onlowlatency}'s algorithm as a Ryu controller, which were responsible for routing, monitoring, visualization, automatic testing and both table and chart generation.
These tools generated the data that will be discussed on section \ref{sec:res}, but, first, we will explain more on how our tool set does what it does on the next subsections.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Representing the topologies}
\label{ssec:reprtopo}
The topologies are stored as JSON files.
Such file contains an array that contains 3 arrays.
The first array is a list of strings that represents the list of hosts.
The second array is a list of strings that represents the list of switches.
The last array is a list of arrays that represents the list of links.
Every array that represents a link contain an two strings that represents either a host or a switch, and the third element is a number that represents the maximum bandwidth of such link.
Then, it comes the time to present the topologies.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{The topologies}
\label{sec:topo}
We tested 9 topologies.
Switches (starts with ``s'', as in ``s3'') were represented by blue circles.
Hosts (starts with ``h'', as in ``h7'') were represented by green circles.
When there is no indication of speed on the edge, the speed is 1 mbps.
The topologies are:

\begin{description}
  \item[Triangle:] This topology (figure \ref{fig:triangle}) is a triangle that has a longer and slower path that only becomes a viable path when the shorter and faster is congested.
    \begin{figure}[H]
      \begin{center}
        \includegraphics[width=0.6\linewidth]{topos/principle.pdf}
        \caption{``Triangle'' topology}
        \label{fig:triangle}
      \end{center}
    \end{figure}

  \item[Fat tree:] Essentially, a complete bipartite graph $k_{3,8}$ where every pair of 2 switches of the group with 8 nodes make another $k_{2,2}$ graph where the new switches which are more distant from the 3 ones from the core gets 2 hosts connected to each one. This topology (figure \ref{fig:fattree}) is equivalent to \citeauthor{pries2011power}'s\cite{pries2011power} one.
    \begin{figure}[H]
      \begin{center}
        \includegraphics[width=0.6\linewidth]{topos/fattree.pdf}
        \caption{``Fat tree'' topology}
        \label{fig:fattree}
      \end{center}
    \end{figure}

  \item[3-layered CLOS:] A 3-stage CLOS fabric as defined by \citeauthor{alshawi2018clos}\cite{alshawi2018clos}, which has 3 switches acting as spine and 8 as access leaves, with 2 hosts per access leaf. It can also be seen as an $k_{3,8}$ (as in Fat tree, but without that $k_{2,2}$ process) with 2 nodes on every one of those 8 nodes, as expressed on figure \ref{fig:clos}.
    \begin{figure}[H]
      \begin{center}
        \includegraphics[width=0.6\linewidth]{topos/clos.pdf}
        \caption{``3-layered CLOS'' topology}
        \label{fig:clos}
      \end{center}
    \end{figure}

  \item[Bipartite:] The same as the previous one (3-layered CLOS), but with an extra switch connecting all 3 spine switches, as expressed on figure \ref{fig:bipartite}.
    \begin{figure}[H]
      \begin{center}
        \includegraphics[width=0.6\linewidth]{topos/bipartite.pdf}
        \caption{``Bipartite'' topology}
        \label{fig:bipartite}
      \end{center}
    \end{figure}

  \item[Binary tree:] A full binary tree with $depth=5$ where the childless nodes are hosts and the other ones are switches, as seen in figure \ref{fig:bintree}.
    \begin{figure}[H]
      \begin{center}
        \includegraphics[width=0.6\linewidth]{topos/simpletree.pdf}
        \caption{``Binary tree'' topology}
        \label{fig:bintree}
      \end{center}
    \end{figure}

  \item[5-layered CLOS:] A 5-stage CLOS fabric as defined by \citeauthor{alshawi2018clos}\cite{alshawi2018clos}, which has 2 rows of 4 access leaves each, 3 rows of 2 spine switches each, and 2 hosts on each access leaf, as seen in figure \ref{fig:clos5}.
    \begin{figure}[H]
      \begin{center}
        \includegraphics[width=0.6\linewidth]{topos/clos5.pdf}
        \caption{``5-layered CLOS'' topology}
        \label{fig:clos5}
      \end{center}
    \end{figure}

  \item[Grid:] This topology (figure \ref{fig:grid}) aims to represent a mesh of squares, which is a trivial solution for physically interconnecting columns rows and rows full of servers.
    \begin{figure}[H]
      \begin{center}
        \includegraphics[width=0.6\linewidth]{topos/grid.pdf}
        \caption{``Grid'' topology}
        \label{fig:grid}
      \end{center}
    \end{figure}

  \item[DCell:] This topology is the same as the one presented in \citeauthor{pries2011power}'s work \cite{pries2011power} and replicated on figure \ref{fig:dcellunused}, however Mininet hosts don't forward packets to other hosts by default and don't obey OpenFlow rules, so a change was needed.
    \begin{figure}[H]
      \begin{center}
        \includegraphics[width=0.6\linewidth]{topos/dcell.pdf}
        \caption{``DCell'' topology}
        \label{fig:dcellunused}
      \end{center}
    \end{figure}
    After such change, every host has become a switch with a host connected to it, and all connections the host had, the switch gets them all, as seen in figure \ref{fig:dcell}.
    \begin{figure}[H]
      \begin{center}
        \includegraphics[width=0.6\linewidth]{topos/dcellswitched.pdf}
        \caption{``DCell'' topology with extra switches}
        \label{fig:dcell}
      \end{center}
    \end{figure}

  \item[BCube:] This topology is the same as the one presented in \citeauthor{pries2011power}'s work \cite{pries2011power} and replicated on figure \ref{fig:bcubeunused}, however the same problem as the one present on DCell repeats.
    \begin{figure}[H]
      \begin{center}
        \includegraphics[width=0.6\linewidth]{topos/bcube.pdf}
        \caption{``BCube'' topology}
        \label{fig:bcubeunused}
      \end{center}
    \end{figure}
    The same solution as applied on DCell also repeats here, as seen in figure \ref{fig:bcube}.
    \begin{figure}[H]
      \begin{center}
        \includegraphics[width=0.6\linewidth]{topos/bcubeswitched.pdf}
        \caption{``BCube'' topology with extra switches}
        \label{fig:bcube}
      \end{center}
    \end{figure}
\end{description}

With these 9 topologies defined, we chose to run 30 tests on them.
However, there was no defined method of testing those topologies in a way that the numbers would reflect routing algorithms' ability to maintain low latency, low jitter and high bandwidth even when congestion is unavoidable.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Testing the topologies}
Suppose we have an array of hosts, such as:

\begin{center}
  \begin{tabular}{|c|c|c|c|c|c|c|c|c|}
    \hline
    h1 & h2 & h3 & h4 & h5 & h6 & h7 & h8 & h9 \\
    \hline
  \end{tabular}
\end{center}

An easy way to get pairs to test is split such array in path, discarding the extra member:

\begin{center}
  \begin{tabular}{c|c||c|c|c|c}
    \cline{2-5}
    \textcolor{white}{hx} & h1 & h2 & h3 & h4 & \textcolor{gray}{h5} \\
    \cline{2-5}
    \textcolor{white}{hx} & h6 & h7 & h8 & h9 & \textcolor{white}{hx} \\
    \cline{2-5}
  \end{tabular}
\end{center}

That done, {\tt h1} and {\tt h6} would be latency tests and all other 3 pairs are bandwidth tests.
This would be a great simple solution if {\tt h1} and {\tt h6} could be more distant to each other on Grid topology (figure \ref{fig:grid}).
Keeping this as is would bring a latency test which result would reflect more about the capacity of the routing algorithm on making a better use of the network.
A host which is consistently the most distant is the last one.
As {\tt h1} and {\tt h6} were, on some topologies, closer to each other than to {\tt h9} (the last element).
Therefore, it was chosen to reverse the second part.
The final combination would be:

\begin{center}
  \begin{tabular}{|c||c|c|c|}
    \hline
    h1 & h2 & h3 & h4 \\
    \hline
    h9 & h8 & h7 & h6 \\
    \hline
  \end{tabular}
\end{center}

That said, {\tt h1} and {\tt h9} are latency tests and all other 3 pairs are bandwidth tests.

Our testing process starts the topology, wait the controller be ready, run the tests, write the results and finishes itself.
All tests are run sequentially and then concurrently, in order to measure all links in its idle and overloaded states.
Between every test there's a wait time that is exactly 2 times the monitoring interval, which is the minimal interval to let the routing algorithm see the entire network as unused.
Latency is {\tt avg} from {\tt ping} command; jitter, {\tt mdev} from the same command; bandwidth, the bottom line of the {\tt iperf} command running as client, using TCP.
The bulk testing script we have also write controller's configuration to change the routing algorithm being tested, starts the controller, run a giver number of tests (we chose 30) for each pair routing algorithm, ends the execution of the controller and destroys the topology and, after all tests were ran, store all results in a single file.

Such file that contains the results of all tests was later used to generate the tables, box and whisker plots and a ternary plot that will be presented on section \ref{sec:res}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{The OpenFlow controller}
Our OpenFlow controller was written using Ryu as framework.
To avoid going over the task of discovering the topology and recovering from topology changes at run-time, it was chosen to let the topology be static and pass the topology (section \ref{ssec:reprtopo}) as a command-line argument because this wouldn't impact the tests.
Translating \citeauthor{gvozdiev2018onlowlatency}'s Points of Presence to Mininet's hosts, it was possible to map the concepts the article mentions into the framework we have.
It was tested 5 routing algorithms: Spanning Tree (Dijkstra-based), 2 implementations of LDR, ECMP (equal-cost multiple path) and a reimplementation of MinMax from \citeauthor{gvozdiev2018onlowlatency}'s textual description.

\citeauthor{gvozdiev2018onlowlatency}'s LDR algorithm is inherently multi-path, using different paths as some sort of link aggregation, which means that packets may not necessarily arrive at the order it was sent.
So, keeping this nature, the linear program from \citeauthor{gvozdiev2018onlowlatency}'s figure 12 \cite[p. 96]{gvozdiev2018onlowlatency} was re-implemented using the library PuLP to solve such linear program, using link load obtained from monitoring and latency on link derives from; figure 13's \cite[p. 96]{gvozdiev2018onlowlatency} conditional path set extension was adapted to always give the linear program the paths the pass a filter that only allow a path stretch up to 1.4 \cite[p. 89]{gvozdiev2018onlowlatency} after applied \citeauthor{gvozdiev2018onlowlatency}'s algorithm 1 \cite[p. 94]{gvozdiev2018onlowlatency}.
This can be summed up as the flow: Monitoring $\rightarrow$ Algorithm 1  $\rightarrow$ APA-based filter doing the role of figure 13 $\rightarrow$ Figure 12 $\rightarrow$ Flow table.
The steps about the ``iteration to assess statistical multiplexing'' as expressed on \citeauthor{gvozdiev2018onlowlatency}'s figure 14 \cite[p. 96]{gvozdiev2018onlowlatency} were simply ignored because we considered it to be very abstract with many possible interpretations and, as consequence, hard to re-implement without re-implementing something the authors did it in a very different manner and achieving completely different results.
We called this ``LDR (multi-path)'' on charts.

However, not all applications handles packets arriving out of their order well.
So, instead of introducing a packet queue (increasing delay and possibly introducing buffer bloat as a concern), we could also change the routing algorithm to choose only one path and change later, as needed.
Instead of solving a linear program, \citeauthor{gvozdiev2018onlowlatency}'s figure 12 \cite[p. 96]{gvozdiev2018onlowlatency} could be interpreted as a cost function, where we should pick the path on the aggregate which is the minimum.
Such cost function would be $C(p) = d_p + \frac{d_p M_1}{S_a} + M_2O_{max} + \sum_{l}^{}{O_l}$, which is the per-aggregate part of the sum which is part of the linear program from \citeauthor{gvozdiev2018onlowlatency}'s figure 12 with the restriction of choosing only one path ($x_{ap} = 1$).
Then, \citeauthor{gvozdiev2018onlowlatency}'s figure 13 \cite[p. 96]{gvozdiev2018onlowlatency} can be applied by interpreting the path set extension as a path change, and the decision if there's any link overloaded as a decision if there's a change that figure 12 suggested that would make the routing better (less congested).
As there is only one path being used by aggregate, there is no statistical multiplexing to be assessed on \citeauthor{gvozdiev2018onlowlatency}'s figure 14 \cite[p. 96]{gvozdiev2018onlowlatency}, so it was ignored.
We called this ``LDR (single-path)'' on charts.

\citeauthor{gvozdiev2018onlowlatency}'s figure 12 \cite[p. 96]{gvozdiev2018onlowlatency} uses 2 constants: $M_1$ and $M_2$.
The only hint to give $M_1$ a value was ``to ensure this is only a tie-break, $M_1$ is a very small constant''\cite[p. 96]{gvozdiev2018onlowlatency}.
As a wild guess over that vague description and seeing its use on \citeauthor{gvozdiev2018onlowlatency}'s figure 12 \cite[p. 96]{gvozdiev2018onlowlatency} as being multiplied by a fraction that ranges from 0 to 1, we started Mininet with its default topology and ran a \texttt{ping} between the two hosts; as the value was $0.6ms$, then we set $M_1$ to be $0.0006$.
$M_2$, on its time, acts over the maximum overload of a given path ``[...] subject to the constraint that they avoid congestion; $M_2$ is large to ensure this term dominates''\cite[p. 95]{gvozdiev2018onlowlatency}.
As another wild guess, we set that value to be 120; with particular reason and with no clue if such value being too large could have negative side-effects, but with the suspicion that $M_2$ should not be a constant, but a metric of such node in a graph that represents the network, however we didn't investigate that possibility on this article.
That said, $M_1$ and $M_2$ lacks better explanation on what it does and how could we calculate them.

Our Dijkstra-based Spanning Tree algorithm takes the advantage of knowing the entire topology (which is also static) and calculates the shortest path for each pair of hosts using Dijskstra.
The advantage of doing Spanning Tree this way and not through OSPF is that we will not be sending the topology over the links periodically, being able to keep the topology completely idle while no host is sending or receiving packets.
No routing difference is expected to exist between this approach and OSPF.
Even that it's not OSPF on a strict sense, we called this ``OSPF (single-path)'' on charts.

Our ECMP implementation lists all paths between two hosts, discards the longer paths which have conflicting flow directions over the same link as shorter ones (or else we would have packet loss) and creates a static link aggregation that uses all remaining flows to deliver packages.
Similarly as ``LDR (multi-path)'', this routing algorithm is expected to have issues with applications which doesn't handle packets arriving out of their order well.
We called this ``ECMP (multi-path)'' on charts.

\citeauthor{gvozdiev2018onlowlatency} describes ``[...] a pure MinMax approach [as one that] optimizes traffic placement so as to minimize the maximum link utilization[...]''\cite[p. 96]{gvozdiev2018onlowlatency}.
Therefore, we wrote an algorithm that takes every host pair from $\powerset(H) \setminus \{(h,h) \forall h \in H\}$ (where $H$ is a set of hosts), simulate all possible paths for that source-destination pair, pick the path that has the minimum maximum path load and update the current network model to have that traffic moved to that chosen path; after all source-destination pairs have been processed, our network model will hold the future network topology.
We called this ``MinMax (single-path)'' on charts.

With all topologies, algorithms and testing methodology defined, we set our topology update cycle to 1 second and ran our tests over some days and we can, now, proceed to the results.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Results} % The results section contains the data collected during experimentation.
\label{sec:res}

\begin{table*}
  \caption{Average of all concurrent latency tests}
  \label{tab:pl}
  \begin{center}
    \resizebox{\linewidth}{!}{\input{boxplots/avg_pl.tex}}
    \newline
    {\footnotesize Lower is better}
  \end{center}
\end{table*}

\begin{table*}
  \caption{Average of all concurrent jitter tests}
  \label{tab:pj}
  \begin{center}
    \resizebox{\linewidth}{!}{\input{boxplots/avg_pj.tex}}
    \newline
    {\footnotesize Lower is better}
  \end{center}
\end{table*}

\begin{table*}
  \caption{Average of all concurrent bandwidth tests}
  \label{tab:pb}
  \begin{center}
    \resizebox{\linewidth}{!}{\input{boxplots/avg_pb.tex}}
    \newline
    {\footnotesize Higher is better}
  \end{center}
\end{table*}

\begin{table*}
  \caption{Average routing algorithm's first response time on controller initialization}
  \label{tab:rd}
  \begin{center}
    \resizebox{\linewidth}{!}{\input{boxplots/avg_rd.tex}}
    \newline
    {\footnotesize Lower is better}
  \end{center}
\end{table*}

%%%%%%%%%%%%%%%%%%

\begin{table*}
  \caption{Deviation from average of all concurrent latency tests}
  \label{tab:dpl}
  \begin{center}
    \resizebox{\linewidth}{!}{\input{boxplots/avgdiff_pl.tex}}
    \newline
    {\footnotesize Lower is better}
  \end{center}
\end{table*}

\begin{table*}
  \caption{Deviation from average of all concurrent jitter tests}
  \label{tab:dpj}
  \begin{center}
    \resizebox{\linewidth}{!}{\input{boxplots/avgdiff_pj.tex}}
    \newline
    {\footnotesize Lower is better}
  \end{center}
\end{table*}

\begin{table*}
  \caption{Deviation from average of all concurrent bandwidth tests}
  \label{tab:dpb}
  \begin{center}
    \resizebox{\linewidth}{!}{\input{boxplots/avgdiff_pb.tex}}
    \newline
    {\footnotesize Higher is better}
  \end{center}
\end{table*}

After all tests ran, we got the 20 tables, 170 box plots and 10 ternary scatter plots.
Due to space restrictions, lack of relevance of some collected data and limited ability to analyze so many facets from a data matrix with 4 dimensions (routing algorithm, topology, metric and sample), we will only display here 7 tables, 33 box plots and a single ternary plot.

The tables \ref{tab:pl} to \ref{tab:rd} have a line and a column named ``All'', which represents the union of the data from all neighbor lines and columns, respectively.
Such union was, later, used to quickly represent all topologies on a single plot.
This can be observed on the box plot figures \ref{fig:bpl} to \ref{fig:bpb}, which were generated using matplotlib\cite{matplotlib}, displaying the median as a green line, the mean as a blue diamond and the outliers as red ``$\times$''s.
Figure \ref{fig:tern} was also plotted using such from ``All'' column, combining individual values of concurrent latencies, jitter and bandwidth, plotted using python-ternary\cite{pythonternary}.

Spotting out the outliers on the tables \ref{tab:pl} to \ref{tab:pb} is difficult, the values from the columns were subtracted from the column ``All''.
Therefore, the tables \ref{tab:dpl} to \ref{tab:dpb} contain the deviation of such algorithm's average on a given topology from all topologies' algorithm average.
On the table \ref{tab:dpl}, we can see that on ``B-Cube'' and on ``3-layered CLOS'' the values of both ``LDR (single-path)'' and ``MinMax (single-path)'' on the table are negative while all others are positive, ``5-layered CLOS'', ``B-Cube'' and ``Triangle'' are positive on ``LDR (multi-path)'' while all others are negative, ``Binary Tree'', ``B-Cube'' and ``3-layered CLOS'' are positive on ``ECMP (multi-path)'' while all others are negative; the value ``ECMP (multi-path)'' $\times$ ``5-layered CLOS'' is the only 3-digit number (after truncation) of the entire table.
On the table \ref{tab:dpj}, we can see that on the line ``Binary Tree'', ``B-Cube'' and ``D-Cell'' the values of both ``LDR (single-path)'' and ``MinMax (single-path)'' on the table are negative while all others are positive; on ``LDR (multi-path)'', ``5-layered CLOS'', ``B-Cube'' and ``Triangle'' are positive while other values are negative; on ``ECMP (multi-path)'', ``Binary Tree'', ``B-Cube'', ``D-Cell'' and ``3-layered CLOS'' are positive while other values are negative; again, ``ECMP (multi-path)'' $\times$ ``5-layered CLOS'' is the most significative value of the table.
On table \ref{tab:dpb}, we can see that on the line ``B-Cube'' and ``Triangle'' the values of both ``LDR (single-path)'' and ``MinMax (single-path)'' on the table are positive while all others are negative; on ``LDR (multi-path)'', ``Binary Tree'', ``5-layered CLOS'' and ``B-Cube'' are negative while other values are positive; on ``ECMP (multi-path)'', ``Binary Tree'', ``B-Cube'', ``D-Cell'' and ``Triangle'' are negative while other values are positive; ``ECMP (multi-path)'' $\times$ ``5-layered CLOS'' isn't the most significative value of the table, but ``Bipartite'' $\times$ ``LDR (multi-path)''.
Therefore, we also included the box plot figures \ref{fig:plsimpletree} to \ref{fig:pbprinciple}, which contains boxes and whiskers blots previously mentioned (which are redundant on table \ref{tab:boxplotoutliers} for readability).

\begin{table}[H]
  \caption{Outliers identified on tables \ref{tab:dpl} to \ref{tab:dpb} which are represented on figures \ref{fig:plsimpletree} to \ref{fig:pbprinciple}, highlighting line with different values}
  \label{tab:boxplotoutliers}
%%%% tab:dpl: Binary Tree, B-Cube, 3-layered CLOS, 5-layered CLOS, Triangle
%%%% tab:dpj: Binary Tree, B-Cube, D-Cell,         5-layered CLOS, Triangle
%%%% tab:dpj: Binary Tree, B-Cube, D-Cell,         5-layered CLOS, Triangle
  \begin{center}
    \resizebox{\linewidth}{!}{
    \rowcolors{1}{white}{black!10!white}
    \begin{tabular}{c|c|c}
      Latency & Jitter & Bandwidth \\
      \hline
       Binary Tree & Binary Tree & Binary Tree \\
       B-Cube & B-Cube & B-Cube \\
       \rowcolor{gray}
       {\color{white} 3-layered CLOS} & {\color{white} D-Cell} & {\color{white} D-Cell} \\
       5-layered CLOS & 5-layered CLOS & 5-layered CLOS \\
       Triangle & Triangle & Triangle \\
    \end{tabular}
    }
  \end{center}
\end{table}

The figures \ref{fig:algospfpl} to \ref{fig:algecmppb} are slices from the aforementioned 4-dimensional matrix, by algorithm and metric, containing the topologies on the horizontal axis, the values of each measurement on the vertical axis, grouped as a box and whiskers plot, also generated using matplotlib\cite{matplotlib}.
Such view will be relevant on discussion section.

Something we didn't mention on section \ref{sec:topo} is that our 5-layered CLOS (figure \ref{fig:clos5}) originally had 3 switches on each spine row.
The problem with that is that there are more paths in that topology than our computer had memory to represent them.
To solve that, we had to reduce it to 2 switches on each spine row.
This may suggest some limitations on routing algorithm applicability.

Something we noticed during the tests was that routing algorithm response time got slower as the topology had more paths available.
Such impact was perceived by watching the topologies being tested on our real-time visualizer, which took more time than 1 second to update the file that the visualizer refreshed 10 times per second.
Such observation suggested that ``MinMax (single-path)'' is slower than ``LDR (single-path)'', which is slower than ``LDR (multi-path)'', which is slower than all the others, which were imperceptibly fast.
This may be important for some applications with already-deployed topologies.

It's worth highlighting that every source value on table \ref{tab:rd} was measured with 2 second intervals between every observation of the controller state.
For example, if a routing algorithm took 4.1 seconds to get ready, the test would have registered it took 6 seconds.
With some variability on the measurements over the 30 samples, the average should be able to smooth those minor imperfections.

After mentioning all the data we obtained and behaviors we observed, we have content to discuss on section \ref{sec:dis}.

%%%% All
%%%% Boxplots
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.9\linewidth]{boxplots/tm_all_pl.pdf}
    \caption{Box and whisker plot of all latency samples from concurrent tests}
    \label{fig:bpl}
  \end{center}
\end{figure}

\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.9\linewidth]{boxplots/tm_all_pj.pdf}
    \caption{Box and whisker plot of all jitter samples from concurrent tests}
    \label{fig:bpj}
  \end{center}
\end{figure}

\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.9\linewidth]{boxplots/tm_all_pb.pdf}
    \caption{Box and whisker plot of all bandwidth samples from concurrent tests}
    \label{fig:bpb}
  \end{center}
\end{figure}

%%%% Ternary scatter plot
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.9\linewidth]{boxplots/tern_p_all.pdf}
    \caption{Ternary scatter plot of the union of all samples from all topologies}
    \label{fig:tern}
  \end{center}
\end{figure}

%%%%%% Figures
%%%% tab:dpl: Binary Tree, B-Cube, 3-layered CLOS, 5-layered CLOS, Triangle
%%%% tab:dpj: Binary Tree, B-Cube, D-Cell,         5-layered CLOS, Triangle
%%%% tab:dpj: Binary Tree, B-Cube, D-Cell,         5-layered CLOS, Triangle

%%%% tab:dpl: Binary Tree, B-Cube, 3-layered CLOS, 5-layered CLOS, Triangle
%% Binary Tree
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_simpletree_pl.pdf}
    \caption{Box and whisker plot of latency samples from concurrent tests on ``Binary Tree'' topology}
    \label{fig:plsimpletree}
  \end{center}
\end{figure}
%% B-Cube
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_bcubeswitched_pl.pdf}
    \caption{Box and whisker plot of latency samples from concurrent tests on ``B-Cube'' topology}
    \label{fig:plbcubeswitched}
  \end{center}
\end{figure}
%% 3-layered CLOS
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_clos_pl.pdf}
    \caption{Box and whisker plot of latency samples from concurrent tests on ``3-layered CLOS'' topology}
    \label{fig:plclos}
  \end{center}
\end{figure}
%% 5-layered CLOS
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_clos5_pl.pdf}
    \caption{Box and whisker plot of latency samples from concurrent tests on ``5-layered CLOS'' topology}
    \label{fig:plclos5}
  \end{center}
\end{figure}
%% Triangle
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_principle_pl.pdf}
    \caption{Box and whisker plot of latency samples from concurrent tests on ``Triangle'' topology}
    \label{fig:plprinciple}
  \end{center}
\end{figure}

%%%% tab:dpj: Binary Tree, B-Cube, D-Cell, 5-layered CLOS, Triangle
%% Binary Tree
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_simpletree_pj.pdf}
    \caption{Box and whisker plot of jitter samples from concurrent tests on ``Binary Tree'' topology}
    \label{fig:pjsimpletree}
  \end{center}
\end{figure}
%% B-Cube
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_bcubeswitched_pj.pdf}
    \caption{Box and whisker plot of jitter samples from concurrent tests on ``B-Cube'' topology}
    \label{fig:pjbcubeswitched}
  \end{center}
\end{figure}
%% D-Cell
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_dcellswitched_pj.pdf}
    \caption{Box and whisker plot of jitter samples from concurrent tests on ``D-Cell'' topology}
    \label{fig:pjdcellswitched}
  \end{center}
\end{figure}
%% 5-layered CLOS
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_clos5_pj.pdf}
    \caption{Box and whisker plot of jitter samples from concurrent tests on ``5-layered CLOS'' topology}
    \label{fig:pjclos5}
  \end{center}
\end{figure}
%% Triangle
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_principle_pj.pdf}
    \caption{Box and whisker plot of jitter samples from concurrent tests on ``Triangle'' topology}
    \label{fig:pjprinciple}
  \end{center}
\end{figure}

%%%% tab:dpb: Binary Tree, B-Cube, D-Cell, 5-layered CLOS, Triangle
%% Binary Tree
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_simpletree_pb.pdf}
    \caption{Box and whisker plot of bandwidth samples from concurrent tests on ``Binary Tree'' topology}
    \label{fig:pbsimpletree}
  \end{center}
\end{figure}
%% B-Cube
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_bcubeswitched_pb.pdf}
    \caption{Box and whisker plot of bandwidth samples from concurrent tests on ``B-Cube'' topology}
    \label{fig:pbbcubeswitched}
  \end{center}
\end{figure}
%% D-Cell
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_dcellswitched_pb.pdf}
    \caption{Box and whisker plot of bandwidth samples from concurrent tests on ``D-Cell'' topology}
    \label{fig:pbdcellswitched}
  \end{center}
\end{figure}
%% 5-layered CLOS
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_clos5_pb.pdf}
    \caption{Box and whisker plot of bandwidth samples from concurrent tests on ``5-layered CLOS'' topology}
    \label{fig:pbclos5}
  \end{center}
\end{figure}
%% Triangle
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.85\linewidth]{boxplots/tm_principle_pb.pdf}
    \caption{Box and whisker plot of bandwidth samples from concurrent tests on ``Triangle'' topology}
    \label{fig:pbprinciple}
  \end{center}
\end{figure}

%%%%%% Figures

%%% OSPF
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ospf_pl.pdf}
    \caption{Box and whisker plot of latency samples from concurrent tests on ``OSPF (single-path)'' algorithm}
    \label{fig:algospfpl}
  \end{center}
\end{figure}
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ospf_pj.pdf}
    \caption{Box and whisker plot of jitter samples from concurrent tests on ``OSPF (single-path)'' algorithm}
    \label{fig:algospfpj}
  \end{center}
\end{figure}
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ospf_pb.pdf}
    \caption{Box and whisker plot of bandwidth samples from concurrent tests on ``OSPF (single-path)'' algorithm}
    \label{fig:algospfpb}
  \end{center}
\end{figure}

%%% LDR s
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ldr-single_pl.pdf}
    \caption{Box and whisker plot of latency samples from concurrent tests on ``LDR (single-path)'' algorithm}
    \label{fig:algldrspl}
  \end{center}
\end{figure}
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ldr-single_pj.pdf}
    \caption{Box and whisker plot of jitter samples from concurrent tests on ``LDR (single-path)'' algorithm}
    \label{fig:algldrspj}
  \end{center}
\end{figure}
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ldr-single_pb.pdf}
    \caption{Box and whisker plot of bandwidth samples from concurrent tests on ``LDR (single-path)'' algorithm}
    \label{fig:algldrspb}
  \end{center}
\end{figure}

%%% MM
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_minmax-single_pl.pdf}
    \caption{Box and whisker plot of latency samples from concurrent tests on ``MinMax (single-path)'' algorithm}
    \label{fig:algmmpl}
  \end{center}
\end{figure}
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_minmax-single_pj.pdf}
    \caption{Box and whisker plot of jitter samples from concurrent tests on ``MinMax (single-path)'' algorithm}
    \label{fig:algmmpj}
  \end{center}
\end{figure}
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_minmax-single_pb.pdf}
    \caption{Box and whisker plot of bandwidth samples from concurrent tests on ``MinMax (single-path)'' algorithm}
    \label{fig:algmmpb}
  \end{center}
\end{figure}

%%% LDR m
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ldr_pl.pdf}
    \caption{Box and whisker plot of latency samples from concurrent tests on ``LDR (multi-path)'' algorithm}
    \label{fig:algldrpl}
  \end{center}
\end{figure}
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ldr_pj.pdf}
    \caption{Box and whisker plot of jitter samples from concurrent tests on ``LDR (multi-path)'' algorithm}
    \label{fig:algldrpj}
  \end{center}
\end{figure}
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ldr_pb.pdf}
    \caption{Box and whisker plot of bandwidth samples from concurrent tests on ``LDR (multi-path)'' algorithm}
    \label{fig:algldrpb}
  \end{center}
\end{figure}

%%% ECMP
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ecmp_pl.pdf}
    \caption{Box and whisker plot of latency samples from concurrent tests on ``ECMP (multi-path)'' algorithm}
    \label{fig:algecmppl}
  \end{center}
\end{figure}
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ecmp_pj.pdf}
    \caption{Box and whisker plot of jitter samples from concurrent tests on ``ECMP (multi-path)'' algorithm}
    \label{fig:algecmppj}
  \end{center}
\end{figure}
\begin{figure}[H]
  \begin{center}
    \includegraphics[width=0.8\linewidth]{boxplots/am_ecmp_pb.pdf}
    \caption{Box and whisker plot of bandwidth samples from concurrent tests on ``ECMP (multi-path)'' algorithm}
    \label{fig:algecmppb}
  \end{center}
\end{figure}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Discussion} % The discussion section will explain the authors interpret their data and how they connect it to other work
\label{sec:dis}
Before we start, we should remember that all paths from all topologies had $1 mbps$ bandwidth, with ``Triangle'' being the only exception, but it had overcapacity.
This is the ideal scenario for the ECMP we implemented; not for LDR.
This suggests that another question very similar to \citeauthor{gvozdiev2018onlowlatency}'s question ``Does routing influence topology?''\cite[p. 99]{gvozdiev2018onlowlatency}, but worded as ``Does topology influence routing?''.
In no moment we expected LDR to be consistently faster, with lower latency and lower jitter than ECMP; and that's the strategy behind the tests: what attributes it gives up to become better at what else?

With $S$ as the set of switches and $H$ as the set of hosts to be routed, $\vert{S}\vert \times \vert{\powerset(H) \setminus \{(h,h) \forall h \in H\}}\vert$ is the sum of the number of entries of every switch of the network.
This number of entries can be feasible for testing few hosts on an emulated environment (our case).
However, our implementation is not suitable for production.

From the data we collected, we see that ``LDR (multi-path)'' gained, on the average case, advantage on bandwidth (figure \ref{fig:bpb}), but lost performance on latency and jitter (figures \ref{fig:bpl} and \ref{fig:bpj}) over other algorithms.
Even that this is not true for all cases, such as in figures \ref{fig:plsimpletree}, \ref{fig:plclos}, \ref{fig:pjdcellswitched}, \ref{fig:pbclos5}, this is the opposite of what we were expecting.
To understand why this happened, we need to remember that \citeauthor{gvozdiev2018onlowlatency}'s proposed routing algorithm tries, first, to eliminate congestion, then choose the path with lower latency.
Eliminating congestion on one overloaded link by sharing its load with multiple underused link sounds like an obvious solution.
The problem starts when you have more demand than capacity.
In such scenario, data shows that this algorithm will attempt keep link usage evenly balanced to keep bandwidth high, even if this means increasing latency and jitter.
If this gets combined with an latency-dependent application that generates more traffic when the latency requirements aren't met (like an hypothetical video chat application that tries compensating a higher delay by increasing the frame rate), a scenario for thrashing appears.
Such behavior, even if it happens under extremely rare and abnormal situations, will be more likely to happen over the time, as technology advances and new requirements are set, enabling DDoS\footnote{\textbf{D}istributed \textbf{D}enial \textbf{o}f \textbf{S}ervice} attacks to use even more bandwidth and new internet applications that make a more intense use of the available infrastructure.
The upcoming 5G technology (which have already arrived on few cities) enables low-latency and near $1 gbps$ mobile connections, while some parts of the infrastructure are still from 4G period; if adding such routing algorithm introduces the possibility of severe service degradation, it's understandable why ISPs wouldn't deploy it in real world.
Finally, this goes the will expressed on the end of \citeauthor{gvozdiev2018onlowlatency}'s discussion section: ``We hope our work can be a first step toward enabling the deployment of ISP topologies that are better than today’s for the provision of low-latency service [...]''\cite[p. 100]{gvozdiev2018onlowlatency}.

From the data we collected ``LDR (single-path)'' exchanges, on the average case, a slightly higher jitter for the average latency variability between measurements to reduce, while keeping the bandwidth nearly the same as ``OSPF (single-path)'', however .
As for an overloaded network, it's performing as expected: changing routes to attempt (in vain) to reduce latency.
However, it didn't perform completely as expected.
As soon as monitoring started inserting monitoring data, response time increased.
On topologies with high path diversity, it meant path changing stopped right after the first monitoring.
As the same thing happened to ``MinMax (single-path)'', it suffered from the same problems, but reducing latency while keeping jitter and bandwidth stable, but not as reduced as in ``LDR (single-path)''.
As the routing decisions are time-dependent, this diminished their performance; specially because each test individual test took 10 seconds.

The figure \ref{fig:tern} shows a dispersion of all collected data.
How the data organizes itself on that ternary plane shows how a incomprehensible mess looking at everything at once is.
Therefore the average used previously may be prone to oversimplification.
A way to don't oversimplificate, in this case, is looking at what topologies each routing algorithm works the best and ignoring the cases where the topology doesn't work well with the algorithm.

Figures \ref{fig:algospfpl} to \ref{fig:algecmppb} brings, then, the required performance information to assist a data center architect to decide which topology and routing algorithm to deploy; ``Principle'' should be ignored, as it was intended to be a proof of concept; ``D-Cell'' should also be ignored, as having 5 cells with 4 hosts each and dividing the hosts in half for testing gave that topology the advantage of having 2 pairs doing bandwidth tests in which both client and server were inside the same cell.
Looking at ``OSPF (single-path)'' (figures \ref{fig:algospfpl}, \ref{fig:algospfpj} and \ref{fig:algospfpb}), ``B-Cube'' brings high bandwidth at the lowest latency and jitter.
Looking at ``LDR (single-path)'' (figures \ref{fig:algldrspl}, \ref{fig:algldrspj} and \ref{fig:algldrspb}) and ``MinMax (single-path)'' (figures \ref{fig:algmmpl}, \ref{fig:algmmpj} and \ref{fig:algmmpb}), ``B-Cube'' brings high bandwidth at the lowest latency and jitter, but these last two suffer from variability in a way that it can perform worse than other topologies; if having a stable latency and/or jitter is a requirement that makes a loss on bandwidth around 27\% tolerable, ``Bipartite'' and ``CLOS-3'' are nearly-equivalent options.
Looking at ``LDR (multiple-path)'' (figures \ref{fig:algldrpl}, \ref{fig:algldrpj} and \ref{fig:algldrpb}), ``Bipartite'' brings the lowest latency and jitter with the highest bandwidth.
Looking at ``ECMP (multiple-path)'' (figures \ref{fig:algecmppl}, \ref{fig:algecmppj} and \ref{fig:algecmppb}), ``Bipartite'' brings the lowest latency and jitter, but a slightly higher bandwidth is found with ``CLOS-3''.
As this didn't take into account cabling costs, generated heat by equipment, cooling, etc, other factors may influence the final choice, but great benefits are expected if the routing algorithm is tuned for the network it'll route and vice-versa.

Finally, we can answer the questions made on the introduction of the article.
% But where?
The compromises were made on the case when there is no alternative path available and all paths are overloaded, but that may not be the case if the topology is similar to a ``3-layered CLOS''.
% Are there other limits which were not mentioned?
Such compromise, when present, becomes a limitation that makes it doesn't handle full-network overload well.
We hope that this article left, at least, an useful comparative testing code artifact for routing algorithms, which can be reused on future comparative works.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Acknowledgments} % The acknowledgments tell you what people or institutions (in addition to the authors) contributed to the work.
We thank CAPES for the financial support given during this research, Magnos Martinello for pointing us towards CLOS topologies, Maxwell Monteiro for pointing us towards ECMP, BCube and DCell, Marin Vlastelica Pogančić for the comprehensible tutorial on HackerNoon on how to use PuLP and Wildan Maulana Syahidillah for the explanatory multi-path routing tutorial using Ryu.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\printbibliography
\end{multicols}
\end{document}