Skip to main content

Throughput, Latency, and Pipelines: Diagnosis Of A Fallacy

Source here.

Latency is the time it takes for a job to complete from start to finish -- for example, if we're downloading a file from a server, we might define the latency of the download as the amount of time it takes from the initial request for the file to the time the last byte is received.

Throughput is a measure of how much can be completed in a given time.  Following the same example, how many files could we download in an hour?

What is the relationship between latency and throughput?  

It make take, again, 1 hour to download a file.  Notice already that we have a relationship between some amount of work that can be completed and time -- if the file is, say, 2 GB, it takes 1 hour to download 2 GB in our example.  What we really want to do here is generalize: how long does it take to download 10 GB?  How many GB can we download in ten hours or one minute?  Generalizing over time, we derive latency; generalizing over work completed, we derive the throughput.

So if we start with an observation of some relationship, deriving throughput and latency is only a matter of unit conversion.  I know I can download 2 GB in an hour, but I want to know how many bytes I download in one second -- or I have a specific job in mind and want to download a 100 GB game, so I want to know how long that will take.

Throughput and latency seem like inverses, however.  I can download 2 GB in an hour, and I consider the hour as my baseline for time and the GB as my baseline for work -- I start with the latency for the 2 GB job, and I want to know its throughput -- or how often a GB is downloaded.

It seems helpful to understand ratios, and what the inversion of a ratio really means -- notionally and not just symbolically.  I can say I download 2 GB in an hour and just as much that in an hour, I download 2 GB -- so that is all we really mean by the inversion.  But when you invert, your units change.  I said above that I consider the GB as my baseline for work, and the inversion no longer expresses a relationship in terms of my baseline.  So I normalize my baseline by solving the simple equation that asks, if I can download 2 GB in an hour, how much I can download in 1 hour -- and easily conclude I can download 0.5 GB.

But there's a fallacy in this reasoning.  The pipeline is provided as a counter-example.  Say that it takes a car 5 hours to be completed via a pipeline process with 5 steps.  Each step in the car's construction takes an hour.  But because of the structure of the system that builds the car, other cars are also being built as each part of this car is completed.  So the throughput in this case is not necessarily 0.2 cars an hour.  If the pipeline is full, the final step will run 5 times in an hour and 5 cars will be completed.  The throughput then is considerably more: 1 car an hour.

Where does the fallacy come from?  We fell into the fallacy, I think, by confusing a data-point for a more general relationship.  We might observe a car come out of a pipeline after 5 hours and then, through inversion and normalization, conclude that the next car will come in another 5 hours.  But if we observed the system, which we might suppose is now saturated, for another hour, we will observe another car come out.  The fallacy is not in the inversion itself but in confusing a data point for a relationship.  Likewise, if you are downloading several files at a time and notice that it takes a certain amount of time to download one file, you can't necessarily arrive at any general conclusion about the throughput of the downloader from the amount of time it took to download that one file.  In fact, during the time it took to download that one file, simply from the hypothesis that other files were also downloading, we know that the throughput is greater than the size of the file itself.

But perhaps it isn't quite right to say that throughput and latency are not inverses -- the point is rather that you can't generalize from a single case.  So if you observe that it takes 5 hours to build a car or 1 hour to download a file, you are not necessarily observing throughput of a system in the first place -- you are just observing how long it took the system to complete a particular job.  You don't know what else the system was doing at that time -- it all depends on how it works.

Comments

Popular posts from this blog

Getting Geodata From Google's API

The apps I'm going to be analyzing are part of Dr. Charles Severance's MOOC on Python and Databases and work together according to the following structure (which applies both in this specific case and more generally to any application that creates and interprets a database using online data). The data source, in this case, is Google's Google Maps Geocoding API.  The "package" has two components: geoload.py  and geodump.py .  geoload.py  reads a list of locations from a file -- addresses for which we would like geographical information -- requests information about them from Google, and stores the information on a database ( geodata.db ).  geodump.py  reads and parses data from the database in JSON, then loads that into a javascript file.  The javascript is then used to create a web page on which the data is visualized as a series of points on the world-map.  Dr. Severance's course focuses on Python, so I'm only going to work my way through ...

Compiling and Executing Java Files With -cp

I decided I was going to "man up" and figure out how to compile a java program with an external dependency from the command line instead of relying on an IDE-- the DOS command line, to be more specific. I ran into a few problems: 1.  The external dependency was given to me as a java file.  I experimented compiling it as a .jar, but I wasn't sure how to import a class from a .jar, so I ended up compiling it into a class. 2.  When I tried to run the file, I got an error saying that the class had been compiled with a different version of Java than my JRE.  The Internet told me to check my path variable for Java.  It sure looked like it was pointing to the latest JRE (and the same version of Java as my compiler).  I asked the Internet again and found the following command: for %I in (java.exe) do @echo %~$PATH:I I'm not exactly sure what the syntax of that magic command is (intuitively it's returning the path that executes when I run the "java" com...

Quick Find / Quick Union (Connected Nodes)

Setup This week I learned about the "Quick Find" or "Quick Union" algorithm. Imagine an NxN grid of nodes, some of which are connected by lines. A connection can be interpreted as accessibility: if two nodes are connected, you can get from one to the other. Every node is accessible to itself: to get where you already are, stay there. Also, If you can get from A to B, you can go back from B to A. And if you can get from A to B and from B to C, then you can get from A to C. As a consequence, the connection between nodes divides the grid into regions of mutually accessible nodes. You can travel from any node in a given region to any other node in that region -- but not to any nodes outside that region (exercise to reader -- proof by contradiction). The problem has two parts. First, find a way to represent this grid structure and the accessibility relation; second, use your schema to efficiently calculate whether two given nodes are accessible to each other. ...